Harvard Medical School
High Performance Computing Team Lead
Harvard Medical School is dedicated to improving health and well-being through excellence in teaching, discovery, and service. The High Performance Computing Team Lead will lead the design and operation of HPC environments, ensuring scalable and secure research workflows while overseeing user access and software environments. This role involves collaborating across teams to maintain a robust foundation for computational activities and contributing to HPC infrastructure initiatives.
Responsibilities
Oversee the design, provisioning, configuration, and decommissioning of HPC compute clusters, ensuring system performance and lifecycle sustainability
Engineer, administer and tune workload schedulers (e.g., Slurm) and cluster management to optimize job throughput, resource utilization, and system availability
Design, maintain, and support secure, regulated compute environments (e.g., NIST 800-171), ensuring that technical safeguards and documentation align with the frameworks required for regulated biomedical research
Design and integrate user account and identity management with institutional systems, supporting secure and streamlined access to HPC resources
Design and maintain customized, sustainable researcher software environments, including module systems and containerized applications, that meet security standards
Lead the team through the software development life cycle for operational tooling and infrastructure automation, and contribute expert-level code
Research, design, and implement technical solutions to meet infrastructure and research requirements
Identify opportunities to improve and simplify compute platform services and implement related enhancements
Contribute to the creation and maturing of operational and automation best practices, including Service Level Agreements
Act as a technical liaison to internal and external stakeholders and collaborators and mentor junior staff
Participate in an off-hours on-call schedule
Other duties as assigned
Qualifications
Required
Minimum of seven years' post-secondary education or relevant work experience
Minimum of 5 years of experience managing Linux-based HPC systems in a research or academic environment
Strong experience with workload schedulers (Slurm preferred), cluster provisioning, and performance tuning
Experience with infrastructure monitoring, configuration management tools (e.g., Ansible), and containerization tools (e.g., Singularity/Apptainer, Docker)
Familiarity with security and compliance requirements in regulated research environments
Excellent troubleshooting, communication, and collaboration skills
Ability to work collaboratively in a team and adapt to evolving technologies and priorities
Excellent interpersonal skills, including the ability to build and cultivate strong relationships and work effectively with diverse groups
Demonstrated “can do” work ethic coupled with effective time management
Benefits
Generous paid time off including parental leave
Medical, dental, and vision health insurance coverage starting on day one
Retirement plans with university contributions
Wellbeing and mental health resources
Support for families and caregivers
Professional development opportunities including tuition assistance and reimbursement
Commuter benefits, discounts and campus perks
Company
Harvard Medical School
At Harvard Medical School, our mission is to create and nurture a diverse community of the best people committed to leadership in alleviating human suffering caused by disease.