Inside Higher Ed · 1 week ago
HPC Scientific Software Engineer (IT@JH Research Computing)
Inside Higher Ed is seeking an HPC Scientific Software Engineer to develop and optimize software solutions for high-performance computing and AI systems. The role involves collaborating with cross-functional teams to enhance system performance, manage software deployment, and provide technical support and training workshops.
Digital Media, Education, Higher Education, Journalism, Recruiting
Responsibilities
Develop and refine deployment strategies for scientific software on HPC and AI systems
Design computational workflows, select optimal software configurations, and use tools like Ansible for automation
Assist teams in implementing, tuning, and optimizing AI models and gateway applications (e.g., XDMoD, Coldfront, Open OnDemand, CryoSPARC Live, SBGrid, AI Agents)
Analyze and optimize the performance of AI models and HPC applications, focusing on GPU-enabled computing
Implement parallel processing, distributed computing, and resource management techniques for efficient job execution
Develop, debug, and maintain software tools, libraries, and frameworks supporting HPC and AI workloads
Collaborate with the system team and software vendors (e.g., NVIDIA, Intel, MATLAB) to optimize systems for maximum performance
Utilize CUDA, DNN, TensorRT, and Intel Compilers to enhance system performance
Manage and support scientific software deployment across HPC, cloud-based, and colocation facilities
Oversee installation, configuration, and maintenance of HPC packages with tools like CMake, Make, EasyBuild, Spack, and Lua module files
Work closely with cross-functional teams, including researchers, data scientists, and software developers, to address complex HPC/AI challenges
Mentor junior engineers and foster a culture of continuous learning
Resolve complex technical issues and perform root cause analysis for HPC/AI software challenges
Implement effective solutions to prevent recurrence and improve system reliability
Provide training workshops for researchers and students, focusing on troubleshooting, optimizing workflows, and effectively using HPC systems
Stay current with advances in HPC and AI technologies and methodologies
Incorporate new research findings into existing systems to improve performance and capabilities
Develop and manage container orchestration strategies to ensure scalability, reliability, and security of applications
Oversee the container lifecycle from creation and deployment to scaling and removal
Create comprehensive documentation for system designs, performance metrics, and project status
Ensure compliance with security and regulatory standards for all HPC and AI systems
Qualifications
Required
Master's Degree in a quantitative discipline
Five (5) years of experience in HPC user support, software deployment, and performance optimization within an academic or research environment
Experience in scientific computing environments and applications
Hands-on experience with SLURM for job scheduling
Proficiency in Python, Perl, C/C++, and Shell scripting for automation and system management
Advanced knowledge of Linux systems and proficiency in scripting languages such as Python, Perl, and Shell
Familiarity with scientific application management tools and practices such as containerization, Lua modules, CMake, Spack, and EasyBuild
Experience with training workshops, performance optimization, and troubleshooting
Additional education may substitute for required experience, and additional related experience may substitute for required education beyond a high school diploma/graduation equivalent, to the extent permitted by the JHU equivalency formula
Preferred
PhD in a quantitative discipline, such as Computer Science Engineering, Physics, Bioinformatics, or related fields, with advanced training in scientific computing
Company
Inside Higher Ed
Inside Higher Ed is the online source for news, opinion, and jobs related to higher education.
Funding
Current Stage: Growth Stage
Total Funding: unknown
2022-01-10: Acquired
2006-08-31: Series Unknown
Recent News
Research & Development World · 2025-05-03
Business Standard India · 2025-04-11
Company data provided by Crunchbase