Berkeley Lab · 21 hours ago
HPC/AI Programming Environment Engineer
Lawrence Berkeley National Lab's NERSC Division is seeking an HPC/AI Programming Environment Engineer to enhance the AI and HPC software environment on their flagship systems. The role involves developing and supporting software frameworks for HPC/AI workloads, collaborating with vendors, and ensuring programming environments meet scientific workflow needs.
Research
Responsibilities
Develop, integrate, and support software frameworks and tools that enable HPC/AI workloads within the NERSC HPC software environment on Perlmutter, Doudna, and future systems
Enable and optimize software environment technologies, including runtime integration, testing, and development of advanced capabilities for Doudna (NERSC-10), a state-of-the-art NVIDIA Vera Rubin system integrated by Dell
Serve as a liaison with NESAP science teams to understand workflow requirements and ensure programming environments meet the needs of scientific workloads
Collaborate with vendors to prioritize, develop, and enhance their technologies to meet the needs of DOE Office of Science application codes and workflows
Evaluate emerging technologies for their applicability to NERSC's scientific workloads
Measure and analyze performance and scalability of software frameworks and runtimes on current and future platforms
Contribute software engineering expertise to cross-team NERSC activities and collaborate across Berkeley Lab and the DOE Office of Science community
Prepare technical documentation, reports, papers, presentations and training materials describing significant results for dissemination within NERSC and the broader research community
Work directly with scientists and developers to ensure correct and optimal usage of software technologies and ensure requirements are met by future development
Provide technical leadership and mentorship within the PEM group and across NERSC
Lead development and deployment efforts for major programming environment initiatives
Represent NERSC in vendor engagements, standards bodies, and the broader HPC community
Work with greater independence and drive strategy for areas of responsibility
Qualifications
Required
Bachelor's degree in Computer Science, Computational Science, Physical Sciences, or related field with a minimum of 8 years of related experience; or Master's degree with 6 years of experience; or equivalent experience
Experience with HPC software stacks and/or AI/ML frameworks such as PyTorch, TensorFlow, JAX, or similar technologies
Experience with state-of-the-art languages, methods, and tools used to program, profile, and debug parallel scientific applications and workflows, such as MPI, OpenMP, CUDA, C++, Rust, Python, or Fortran
Knowledge of the Linux environment
Excellent written and oral communication skills
Demonstrated ability to work effectively as part of a cross-disciplinary team
Minimum of 12 years of related experience with a Bachelor's degree; or 8 years with a Master's degree; or equivalent experience
Track record of technical leadership or leading collaborative projects
Recognized expertise and established professional network in HPC or related fields
Preferred
Ph.D. in Computer Science, Computational Science, Physical Sciences, or related field
Experience with production HPC environments and deploying services at scale
Experience with high-performance interconnects and distributed communication libraries for HPC and AI workloads, such as MPI, NCCL, libfabric, or UCX
Experience with container technologies (e.g., Docker, Podman, Singularity/Apptainer) and their application in HPC environments
Experience with hardware and software technologies in emerging areas such as cloud computing, AI accelerators, and their application to HPC
Demonstrated track record of contributions to relevant open source projects, software standards, or community initiatives
Nationally or internationally recognized expertise in an HPC-related discipline
Benefits
Exceptional health and retirement benefits, including pension or 401(k)-style plans
Opportunities to grow in your career - check out our Tuition Assistance Program
A culture where you’ll belong - we are invested in our teams!
In addition to accruing vacation and sick time, we also have a Winter Holiday Shutdown every year.
Parental bonding leave (for both mothers and fathers)
Pet insurance
Company
Berkeley Lab
Berkeley Lab is a national laboratory that creates advanced new tools for scientific discovery.
Funding
Current Stage
Late Stage