SYSTEMS ENGINEER SR PRINCIPAL (HPC/AI System Administrator, Storage Engineer, Monitoring Expert, Solution Architect, Security/Provisioning Engineer, or Multi-discipline Expert) jobs in United States

General Dynamics Information Technology · 1 week ago

General Dynamics Information Technology is a global technology and professional services company that delivers consulting, technology, and mission services to every major agency across the U.S. government. As a Systems Engineer Sr Principal, you will support the lifecycle sustainment and operational availability of High Performance Computing (HPC) clusters for the National Oceanic and Atmospheric Administration, focusing on enhancing service delivery and customer satisfaction.

Artificial Intelligence (AI), Cloud Computing, Consulting, Cyber Security, Information Technology
No H1B; U.S. Citizen Only

Responsibilities

Lead, manage, and support the day-to-day operations, sustainment, HPC services delivery, and incremental enhancements of two geographically separated HPC clusters that are contractor-owned and contractor-operated (COCO) by GDIT and used exclusively for the Weather and Climate Operational Supercomputing System (WCOSS)
Collaborate with the GDIT WCOSS team as a senior-level HPC functional expert, addressing intricate and multifaceted HPC challenges by continuously providing innovative ideas, solutions, and resolutions for customer requests, issues, and efficiency improvements
Drive and prioritize resource utilization towards continuously improving customer satisfaction with GDIT's HPC service delivery and exceeding the contract service level metrics of uptime, availability, performance, stability, and on-time product delivery
Use past experience, team collaboration, system management and troubleshooting applications, and ingenuity to support customer operations while working on systems that range in capacity from 1,000 to 3,000+ nodes and hundreds of PB of storage per system

Qualifications

High Performance Computing, Linux, Scripting, Networking Concepts, System Performance Monitoring, Troubleshooting Hardware, Team Collaboration, Project Leadership

Required

Education: Bachelor of Arts/Bachelor of Science
Experience: 10+ years of related experience
Technical skills: Highly proficient with Linux (RockyOS, SLES, etc.); scripting in Python, Perl, or Bash; networking concepts and technologies such as Ethernet, InfiniBand, and Slingshot, TCP/IP networking, basic routing, and network services; programming in Python, C/C++, or Fortran; administering PBS Pro, Slurm, or other batch systems in an HPC cluster; and system performance monitoring and tuning in an HPC cluster environment (e.g., OpenSearch, Grafana, Prometheus)
Security clearance level: must complete a satisfactory background investigation
US citizenship required
Role requirements: Expected to perform as an individual SME contributor, functional lead, or project/task leader responsible for work product delivery. Extensive experience in troubleshooting, diagnosing, and repairing hardware failures to the component level on servers, and in coordinating with vendors to resolve hardware and software problems
Minimal travel required for onsite work, team collaboration, training, and customer interaction

Benefits

Comprehensive benefits and wellness packages
401K with company match
Full flex work weeks
Paid time off plans, including vacation, sick and personal time, holidays, paid parental, military, bereavement and jury duty leave
15 days of paid leave per calendar year
10 paid holidays per year
Paid Family Leave program provides up to 160 hours of paid leave in a rolling 12-month period
Short and long-term disability benefits
Life, accidental death and dismemberment, personal accident, critical illness and business travel and accident insurance

Company

General Dynamics Information Technology

General Dynamics Information Technology is an IT consulting company that specializes in cyber security, AI, and quantum computing. It is a sub-organization of General Dynamics.

Funding

Current Stage
Late Stage

Leadership Team

Paul Nedzbala
Senior Vice President
Ben Buckley
Vice President and General Manager
Company data provided by Crunchbase