
Berkeley Lab · 1 day ago

HPC/AI Programming Environment Engineer

Lawrence Berkeley National Lab's NERSC Division is seeking an HPC/AI Programming Environment Engineer to enhance the AI and HPC software environment on their flagship systems. The role involves developing and supporting software frameworks for HPC/AI workloads, collaborating with vendors, and ensuring programming environments meet scientific workflow needs.

Research
No H1B · U.S. Citizen Only

Responsibilities

Develop, integrate, and support software frameworks and tools that enable HPC/AI workloads within the NERSC HPC software environment on Perlmutter, Doudna, and future systems
Enable and optimize software environment technologies, including runtime integration, testing, and development of advanced capabilities for Doudna (NERSC-10), a state-of-the-art NVIDIA Vera Rubin system integrated by Dell
Serve as a liaison with NESAP science teams to understand workflow requirements and ensure programming environments meet the needs of scientific workloads
Collaborate with vendors to prioritize, develop, and enhance their technologies to meet the needs of DOE Office of Science application codes and workflows
Evaluate emerging technologies for their applicability to NERSC's scientific workloads
Measure and analyze performance and scalability of software frameworks and runtimes on current and future platforms
Contribute software engineering expertise to cross-team NERSC activities and collaborate across Berkeley Lab and the DOE Office of Science community
Prepare technical documentation, reports, papers, presentations and training materials describing significant results for dissemination within NERSC and the broader research community
Work directly with scientists and developers to ensure correct and optimal usage of software technologies and ensure requirements are met by future development
Provide technical leadership and mentorship within the PEM group and across NERSC
Lead development and deployment efforts for major programming environment initiatives
Represent NERSC in vendor engagements, standards bodies, and the broader HPC community
Work with greater independence and drive strategy for areas of responsibility

Qualifications

HPC software stacks, AI/ML frameworks, Parallel programming, Linux environment, Container technologies, Performance analysis, Technical documentation, Communication skills, Team collaboration

Required

Bachelor's degree in Computer Science, Computational Science, Physical Sciences, or related field with a minimum of 8 years of related experience; or Master's degree with 6 years of experience; or equivalent experience
Experience with HPC software stacks and/or AI/ML frameworks such as PyTorch, TensorFlow, JAX, or similar technologies
Experience with state-of-the-art languages, methods, and tools used to program, profile, and debug parallel scientific applications and workflows, such as MPI, OpenMP, CUDA, C++, Rust, Python, or Fortran
Knowledge of the Linux environment
Excellent written and oral communication skills
Demonstrated ability to work effectively as part of a cross-disciplinary team
Minimum of 12 years of related experience with a Bachelor's degree; or 8 years with a Master's degree; or equivalent experience
Track record of technical leadership or leading collaborative projects
Recognized expertise and established professional network in HPC or related fields

Preferred

Ph.D. in Computer Science, Computational Science, Physical Sciences, or related field
Experience with production HPC environments and deploying services at scale
Experience with high-performance interconnects and distributed communication libraries for HPC and AI workloads, such as MPI, NCCL, libfabric, or UCX
Experience with container technologies (e.g., Docker, Podman, Singularity/Apptainer) and their application in HPC environments
Experience with hardware and software technologies in emerging areas such as cloud computing, AI accelerators, and their application to HPC
Demonstrated track record of contributions to relevant open source projects, software standards, or community initiatives
Nationally or internationally recognized expertise in an HPC-related discipline

Benefits

Exceptional health and retirement benefits, including pension or 401(k)-style plans
Opportunities to grow in your career - check out our Tuition Assistance Program
A culture where you’ll belong - we are invested in our teams!
In addition to accruing vacation and sick time, we also have a Winter Holiday Shutdown every year.
Parental bonding leave (for both mothers and fathers)
Pet insurance

Company

Berkeley Lab

Berkeley Lab is a national laboratory that creates advanced new tools for scientific discovery.

Funding

Current Stage
Late Stage

Leadership Team

Mary Barnum, MBA
Business Manager, COO Office
Rebecca Rishell
Deputy Chief Operating Officer
Company data provided by Crunchbase