
Icahn School of Medicine at Mount Sinai · 6 hours ago

Senior High Performance Computing System Administrator

Icahn School of Medicine at Mount Sinai is a leading academic medical system in the New York metro area and is seeking a Senior High Performance Computing System Administrator. The role involves managing a high-performance computational and data ecosystem to support researchers, ensuring compliance with regulations, and leading troubleshooting efforts.

Hospital & Health Care
H1B Sponsor Likely
Hiring Manager
Sandra Sterthous, SHRM-CP

Responsibilities

Designs, deploys and maintains Scientific Computing’s computational and data science ecosystem, including ~30,000 cores with high bandwidth, low latency interconnects, GPUs, large shared memory nodes, databases, scientific workflows and 30+ petabytes of storage in production, clinical data warehouse and software development environments
Leads the troubleshooting, isolation and resolution of all technical issues (including application, system, hardware, software, and network). Actively monitors the systems
Maintains, tunes and manages computational, data, cloud technologies and workflow systems for ISMMS researchers, scientists and their external collaborators. Defines and deploys a comprehensive computational and data vision. Identifies and communicates system advantages/disadvantages and tradeoffs
Designs, develops and implements system administration tasks, including hardware and software configuration, configuration management, system monitoring (including the development and maintenance of regression tests), usage reporting, system performance (file systems, scheduler, interconnect, high availability, etc.), security, networking and metrics
Collaborates effectively with research and hospital system IT, compliance, HIPAA, security and other departments to ensure compliance with all regulations and Sinai policies
Participates in the integration of HPC resources with laboratory equipment such as sequencers, clinical and research data resources and systems, etc. Incorporates and links data and compute resources
Researches, deploys and optimizes resource management and scheduling software and policies, and actively monitors them. Designs, tunes, manages and upgrades parallel file systems, storage and data-oriented resources
Researches, deploys and manages security infrastructure, including development of policies and procedures
Maintains all necessary aspects of HPC in accordance with best practices. Develops and implements backup policies
Prepares and manages budgets for hardware, software and maintenance. Participates in chargeback/fee recovery analysis and provides suggestions to make operations sustainable
Assists in developing and writing system design for research proposals. Creates and provides clear documentation
Works effectively and productively with other team members within the group and across Mount Sinai
Performs related duties as assigned or requested
Provides after hours support for critical system and production issues
Answers and resolves user tickets

Qualifications

HPC system administration
Redhat/CentOS Linux
Job scheduler (LSF/Slurm)
Configuration management (xCAT/Puppet/Ansible)
Networking
Security
Parallel file systems
Databases
Web services
Cloud Computing
Troubleshooting
Customer service
Analytical ability
Communication skills
Team player

Required

Bachelor's degree in computer science, engineering or another scientific field
8+ years (more preferred) of progressive HPC system administration and operations experience (preferably in a Redhat/CentOS Linux batch HPC cluster environment)
Must be an expert troubleshooter, a team player and customer focused
Experience with a job scheduler such as LSF or Slurm, and with parallel file systems and storage
Experience with networking and security
Experience with configuration management systems such as xCAT, Puppet and/or Ansible
Experience with databases and web services
Experience with InfiniBand and Gigabit Ethernet
Experience in an academic or research community environment
Scripting and programming experience
Experience with Cloud Computing
Ability to multitask effectively in a dynamic environment
Excellent communication skills, analytical ability, strong judgment and management skills, and the ability to work effectively as a liaison between both research and technology teams
Strong written, oral, and interpersonal communication skills

Preferred

Advanced degree
Experience with GPFS, LSF, TSM, IB and Ethernet networking
Experience with databases and web services is highly preferred

Company

Icahn School of Medicine at Mount Sinai

The Icahn School of Medicine at Mount Sinai is an international leader in medical and scientific training, biomedical research, and patient care.

H1B Sponsorship

Icahn School of Medicine at Mount Sinai has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data Powered by US Department of Labor)
Trends of Total Sponsorships
2025 (285)
2024 (298)
2023 (324)
2022 (294)
2021 (276)
2020 (305)

Funding

Current Stage
Late Stage

Leadership Team

Charles Powell MD MBA
System Chief, Division of Pulmonary, Critical Care & Sleep Medicine | CEO Respiratory Institute
Angela J. Lamb, MD
Chief Technology Officer, Department of Dermatology
Company data provided by crunchbase