Lorven Technologies Inc. · 2 hours ago
Sr HPC Administrator
Lorven Technologies Inc. is seeking a Senior HPC Administrator to manage a computational and data science ecosystem for researchers at Mount Sinai. The role involves overseeing high-performance computing systems and clinical research databases, ensuring compliance with regulations, and providing excellent customer service to researchers.
Responsibilities
Designs, deploys and maintains Scientific Computing’s computational and data science ecosystem, including ~30,000 cores with high-bandwidth, low-latency interconnects, GPUs, large shared-memory nodes, databases, scientific workflows and 30+ petabytes of storage across production, clinical data warehouse and software development environments
Leads the troubleshooting, isolation and resolution of all technical issues (application, system, hardware, software and network). Actively monitors the systems
Maintains, tunes and manages computational, data and cloud technologies and workflow systems for ISMMS researchers, scientists and their external collaborators. Defines and deploys a comprehensive computational and data vision. Identifies and communicates system advantages, disadvantages and tradeoffs
Designs, develops and implements system administration tasks, including hardware and software configuration, configuration management, system monitoring (including the development and maintenance of regression tests), usage reporting, system performance (file systems, scheduler, interconnect, high availability, etc.), security, networking and metrics
Collaborates effectively with research and hospital system IT, compliance, HIPAA, security and other departments to ensure compliance with all regulations and Sinai policies
Participates in the integration of HPC resources with laboratory equipment such as sequencers, and with clinical and research data resources and systems. Incorporates and links data and compute resources
Researches, deploys, optimizes and actively monitors resource management and scheduling software and policies. Designs, tunes, manages and upgrades parallel file systems, storage and data-oriented resources
Researches, deploys and manages security infrastructure, including development of policies and procedures
Maintains all necessary aspects of HPC in accordance with best practices. Develops and implements backup policies
Prepares and manages budgets for hardware, software and maintenance. Participates in chargeback/fee recovery analysis and provides suggestions to make operations sustainable
Assists in developing and writing system designs for research proposals. Creates and provides clear documentation
Works effectively and productively with other team members within the group and across Mount Sinai
Performs related duties as assigned or requested
Provides after hours support for critical system and production issues
Answers and resolves user tickets
Qualifications
Required
Bachelor's degree in computer science, engineering or another scientific field
8+ years (more preferred) of progressive HPC system administration and operations experience, preferably in a Red Hat/CentOS Linux batch HPC cluster environment
Must be an expert troubleshooter, a team player and customer focused
Experience with a job scheduler such as LSF or Slurm, and with parallel file systems and storage
Experience with networking and security
Experience with configuration management systems such as xCAT, Puppet and/or Ansible
Experience with databases and web services
Experience with InfiniBand and Gigabit Ethernet
Experience in an academic or research community environment
Scripting and programming experience
Experience with Cloud Computing
Ability to multitask effectively in a dynamic environment
Excellent communication skills, analytical ability, strong judgment and management skills, and the ability to work effectively as a liaison between both research and technology teams
Strong written, oral, and interpersonal communication skills
Preferred
Advanced degree
Experience with GPFS, LSF, TSM, and InfiniBand (IB) and Ethernet networking
Experience with databases and web services is highly preferred
Company
Lorven Technologies Inc.
Lorven Technologies, Inc. is a highly recognized provider of professional technology consultancy in the US.
H1B Sponsorship
Lorven Technologies Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (11)
2024 (11)
2023 (13)
2022 (14)
2021 (12)
2020 (17)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase