
AMERICAN SYSTEMS · 3 days ago

HPC Engineer

AMERICAN SYSTEMS is an employee-owned federal government contractor supporting national priority programs through strategic solutions in various areas. As an HPC Engineer, you will apply extensive knowledge of High-Performance Computing systems and provide technical expertise in support of software and hardware solutions, while also developing system requirements and documentation.

Government, Information Technology
No H1B; Security Clearance Required; U.S. Citizen Only

Responsibilities

Apply comprehensive knowledge of High-Performance Computing (HPC) systems comprising high-speed, multi-petabyte Lustre file systems, Red Hat Enterprise Linux (RHEL) servers, CPU/GPU compute nodes, and high-performance storage arrays, connected via Ethernet, fiber, Omni-Path, and InfiniBand interconnects
Provide functional and technical expertise in support of user-developed software and technical advice and leadership to other technical staff
Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security compliance; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution
Diagnose and troubleshoot technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies
Conduct needs analysis, and plan and schedule the installation of a wide variety of new or modified hardware/software
Develop functional and technical IT system requirements and specifications. Configure and optimize system tools and applications, including job schedulers (Slurm and PBS Pro) and system resources (GitLab, Lua/Tcl modules, and system support applications)
Create and deliver technical presentations to technical and non-technical stakeholders. Maintain detailed documentation of system configurations, procedures, and troubleshooting guides. Develop user-facing documentation

Qualifications

High-Performance Computing (HPC), Linux/Unix systems, Distributed computing, Python scripting, Job schedulers (PBS), Job schedulers (Slurm), HPC middleware, Problem diagnosis, Technical presentations, Documentation, Team leadership

Required

DoD Top Secret (TS) clearance with SCI eligibility
Bachelor's in Computer Engineering, Computer Science, or related field and ten or more years of job-related experience
Thorough knowledge of complex concepts, practices, and troubleshooting associated with HPC cluster systems design, installation, and maintenance
Advanced knowledge of distributed computing theory, parallel processing, applications, and associated infrastructure
Extensive experience with Linux/Unix systems including installation, configuration, networking, backups, updates and patching, data archiving, and system security
Functional knowledge of HPC middleware and platform managers such as Bright Cluster Manager, of job schedulers such as PBS, Slurm, and Torque, and of job queue optimization
Experience with HPC or large-scale distributed computing environments and technologies such as high-speed, low-latency interconnects (e.g., InfiniBand), parallel file systems (e.g., Lustre), and virtualization environments and tools (e.g., VMware)
Experience developing Python/Bash/Perl scripts and employing automation frameworks such as Ansible
General knowledge of Docker containers and the Kubernetes ecosystem
Working knowledge of one or more programming languages (e.g., C/C++, Fortran)

Benefits

Healthcare benefits
Paid leave
Retirement plans
Insurance programs
Education and training assistance

Company

AMERICAN SYSTEMS

AMERICAN SYSTEMS is one of the largest employee-owned companies in the US.

Funding

Current Stage
Late Stage

Leadership Team

John Steckel
President & CEO
Peter Whitfield
Chief Financial Officer
Company data provided by Crunchbase