Yale University · 1 month ago

Senior High Performance Computing System Administrator

Yale University is seeking a Senior High Performance Computing System Administrator to support their AI HPC infrastructure for research and scholarship. The role involves system design, deployment, and support of AI-focused research clusters, with a strong emphasis on GPU infrastructure enhancements.

Association, Business Development, Education, Medical, Social Entrepreneurship
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Design, implement and advance core HPC systems such as the HPC provisioning system, the resource-management system, account/user lifecycle management, and user authentication and authorization systems
Design, deploy, configure and support HPC clusters, including compute, networking, parallel storage and backup
Install, administer and maintain hardware, system software, networking, accounts, and security measures
Diagnose and correct system issues, whether these be issues with correct operation or performance
Develop and maintain documentation
Research developments in HPC architecture and new technologies, processes, and methodologies
Determine specifications for new systems, and tailor these to meet research needs

Qualifications

GPU expertise, HPC Linux clusters, High-speed networking, Large storage systems, Linux system administration, Automation, Scripting, Multi-node GPU systems, Parallel file systems, Computer security, Verbal, Data-center experience, Professional certifications, Graduate degree, Team collaboration, Writing skills, Attention to detail, Independent work

Required

Experience with accelerators such as GPUs for AI, including expertise with system-level tradeoffs in such areas as accelerator-based memory, precision, within-node interconnect, multi-node interconnect, cost and power consumption
Expertise in administration of HPC Linux clusters, including managing and configuring cluster provisioning and management tools and the batch scheduler
Experience with high-speed networking such as InfiniBand and high-speed Ethernet
Experience with large storage systems and parallel file systems such as GPFS and Lustre
Expertise in Linux system administration, including managing the operating system, networking, storage, and security
Expertise in automation and scripting in at least one scripting language
Ability to work in a team environment in a fast-moving technology field. Excellent verbal and writing skills
Ability to interact well with team members and end users. Ability to work independently and across units
Attention to detail. Ability to take the care necessary to be entrusted with a system that hundreds of users depend on for research computation and the storage of research data
Bachelor's Degree in a related field and a minimum of six years of related work experience or an equivalent combination of education and experience

Preferred

Demonstrated ability to specify, install, configure, and support multi-node GPU systems, and tune them for AI applications
Demonstrated ability to design, implement, and maintain a local, customized implementation and configuration of a core HPC system such as the HPC provisioning system, the resource-management system, account/user lifecycle management, or user authentication and authorization systems
Experience supporting technology in a research environment
Expertise in configuration, deployment, support, and backup of large-scale parallel storage systems
Experience administering high-speed networking such as InfiniBand or high-speed Ethernet in a cluster environment
Expertise in computer security, preferably in the context of large, multi-user Linux environments
Experience in a data-center environment, installing and troubleshooting hardware
Professional certifications related to the above
Graduate degree in a related field

Company

Yale University

Yale University is a research and education institution that prioritizes its students.

H1B Sponsorship

Yale University has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (355)
2024 (449)
2023 (214)
2022 (208)
2021 (190)
2020 (155)

Funding

Current Stage
Late Stage
Total Funding
$42.86M
Key Investors
Bezos Earth Fund, Alfred P. Sloan Foundation, Hyundai Hope On Wheels
2025-10-23 · Grant
2025-05-21 · Grant · $0.05M
2023-01-01 · Grant · $1.27M

Leadership Team

Geoffrey Chatas
Senior Vice President for Operations
Jack Callahan
Senior Vice President for Operations
Company data provided by Crunchbase