Leidos · 3 months ago
Graphics Processing Unit (GPU) Engineer
Leidos is a company that delivers innovative solutions through its diverse and talented workforce. They are seeking a highly skilled Graphics Processing Unit (GPU) Engineer to design, develop, and optimize GPU clusters that power enterprise AI for mission customers, ensuring smooth integration with Linux-based systems and optimizing performance.
Computer, Government, Information Services, Information Technology, National Security, Software
Responsibilities
Design, configure, and maintain GPU clusters
Collaborate with a multidisciplinary team to define and optimize architectures, ensuring they meet performance, power efficiency, and feature requirements
Work closely with AI/ML engineers to ensure smooth GPU integration with Linux-based systems
Optimize GPU drivers for compatibility, reliability, and performance
Provide regular maintenance and updates
Analyze GPU performance, identify bottlenecks, and develop strategies to improve efficiency across hardware and software layers
Build and maintain debugging tools, profiling utilities, and performance analysis software for Linux environments
Leverage scripting and configuration tools such as Bash, Python, Ansible, Puppet, and Salt
Maintain technical documentation, architectural specifications, and Linux best practices
Support ATO (Authority to Operate) and ensure compliance with federal security standards
Qualifications
Required
Bachelor's or higher degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field with at least 12 years of related technical experience. Additional years of experience may be considered in lieu of a degree
10+ years of relevant systems engineering experience
Experience managing NVIDIA GPU data center platforms (DGX, HGX, H200, H100, L4s)
Knowledge of enterprise server components (storage/network controllers, HBA, SSDs)
Strong expertise with Linux distributions (RHEL, Ubuntu, Oracle Linux, and Rocky Linux)
Excellent problem-solving skills and the ability to collaborate within a team
Candidate must, at a minimum, meet DoD 8570.11 IAT Level II certification requirements (currently Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP, along with an appropriate computing environment (CE) certification). An IAT Level III certification would also be acceptable (CASP+, CCNP Security, CISA, CISSP, GCED, GCIH, CCSP)
Active TS/SCI clearance with polygraph required, or active TS/SCI clearance and willingness to obtain and maintain a polygraph
US Citizenship is required due to the nature of the government contracts we support
Preferred
Experience with Kubernetes cluster management and AI/ML workflow orchestration (Argo, Airflow, and Kubeflow)
Familiarity with GPU virtualization and cloud computing
Experience with Prometheus/Grafana for monitoring
Knowledge of distributed resource scheduling systems such as Slurm (preferred) or LSF
Company
Leidos
Leidos is a Fortune 500® innovation company rapidly addressing the world’s most vexing challenges in national security and health.
Funding
Current Stage
Public Company
Total Funding
unknown
2025-02-20 Post-IPO Debt
2013-09-17 IPO
Recent News
MarketScreener
2025-12-16
Company data provided by crunchbase