University of Colorado Boulder
Senior Technical Lead of Research Infrastructure
The University of Colorado Boulder is seeking a Senior Technical Lead of Research Infrastructure to provide technical leadership and hands-on expertise for research computing infrastructure. This role involves mentoring team members, translating architectural direction into practical implementations, and managing HPC and storage systems while ensuring system reliability and operational improvements.
Responsibilities
The Senior Technical Lead translates architectural direction into hands-on infrastructure solutions, serving as the team's primary technical escalation point when complex HPC and storage challenges arise
This role shapes day-to-day technical decision-making for infrastructure operations and improvements, while establishing and maintaining the technical standards, procedures, and standard practices that guide the team's systems work
The position tackles sophisticated multi-system issues that span infrastructure domains and champions automation, monitoring, and operational improvements that strengthen system reliability
This position performs hands-on administration of HPC clusters, storage systems (ZFS, RAID, GPFS, Lustre), and parallel computing infrastructure, leading complex system changes, upgrades, and optimizations
This role conducts hardware repairs, OS configuration (Linux/Unix), and software updates while optimizing system performance, resource utilization, and data-transfer capabilities (Globus)
The position manages compute resources and job schedulers (Slurm), automates infrastructure provisioning through configuration management tools (Ansible, Puppet, Chef), and develops monitoring and observability platforms (Nagios, Grafana) to maintain system reliability
The Senior Technical Lead mentors HPC and Storage System Administrators on technical skills and problem-solving approaches, providing hands-on guidance during complex implementations and troubleshooting
This role develops team capabilities through pairing, code reviews, and guided learning while building team confidence to handle infrastructure challenges independently
The position coaches team members on user documentation and knowledge-sharing, supports cross-training initiatives to reduce single points of failure, and champions a collaborative problem-solving culture within the RIT team
The Senior Technical Lead maintains technical runbooks, procedures, and troubleshooting guides while documenting system configurations and implementation details
This role creates and updates architectural diagrams for team reference, works with the team to build a knowledge base and wiki, and conducts technical knowledge-sharing sessions for the RIT team
The Senior Technical Lead coordinates with User Support (UST), Data Center Operations (DCOPS) and other teams on technical issues, participates in sprint planning and Agile processes, and provides technical input on infrastructure planning and vendor evaluations
This role supports the Associate Director with technical assessments and recommendations, and advises researchers on optimal infrastructure use when questions arise
The position is expected to use open source and community projects to enhance infrastructure capabilities
This position will maintain professional expertise in the field by reviewing trade publications, studying the latest vendor trends, following pertinent mailing lists, and attending seminars, training sessions, and conferences
This position will identify and select relevant training opportunities that would be most beneficial to the organization
Qualifications
Required
Bachelor's Degree in Computer Science, Computer Engineering, Engineering or related field. A combination of education and relevant experience as described below may be substituted for a degree on a year-for-year basis
5+ years experience in IT infrastructure administration with deep technical expertise in HPC or storage systems
5+ years experience working with HPC clusters or large-scale research storage environments
3+ years in a senior technical role with mentorship or technical leadership responsibilities
Exceptional ability to work effectively both within a team and independently, as circumstances warrant
Ability to follow through with assignments and commitments in a timely and professional manner
Ability to work from a set of requirements to build complex computing systems
Ability to develop and advocate independent solutions and system designs
Experience in system and related network administration of complex computer systems, specifically Linux systems and preferably Linux clusters
Experience diagnosing and repairing computer hardware
Experience with batch queueing systems, preferably Slurm
Experience with network interconnects (e.g., Intel Omni-Path Architecture, Mellanox InfiniBand, RoCE)
Preferred
Master's Degree or equivalent experience
Background in higher education or research environments
Linux Professional Institute Certification (LPIC) or Red Hat Certified Engineer (RHCE) certification
Demonstrated experience in system and related network administration of complex computer systems, specifically Linux systems and preferably Linux clusters
Demonstrated experience with an on-premises clustered virtualization environment. For example: OpenStack (preferred), OpenNebula, oVirt, Proxmox
Experience with Linux network file systems, particularly NFSv3
Benefits
Medical
Dental
Retirement plans
Generous paid time off
Tuition assistance for you and your dependents
ECO Pass for local transit
Company
University of Colorado Boulder
University of Colorado Boulder is a bold, innovative community of scholars and learners who accelerate human potential
Funding
Current Stage: Late Stage
Total Funding: $29.19M
Key Investors: US Department of Commerce, Economic Development Administration; U.S. Environmental Protection Agency; National Science Foundation
2023-10-10 · Grant · $1.4M
2022-06-28 · Grant · $0.03M
2021-09-29 · Grant · $22M
Company data provided by crunchbase