OpenAI · 6 months ago
Software Engineer, Fleet Hardware Health
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. As a software engineer on the Fleet Hardware team, you will be responsible for the reliability and uptime of OpenAI’s compute fleet, developing automation systems and tools to monitor server health and performance. This role requires a focus on system-level investigations and the development of automated solutions to maintain the health and efficiency of supercomputing infrastructure.
Agentic AI, Artificial Intelligence (AI), Foundational AI, Generative AI, Machine Learning, Natural Language Processing, SaaS
Responsibilities
Build and maintain automation systems for provisioning and managing server fleets
Develop tools to monitor server health, performance, and lifecycle events
Collaborate with clusters, networking, and infrastructure teams
Partner with external operators to ensure a high level of quality
Identify and fix performance bottlenecks and inefficiencies
Continuously improve automation to reduce manual work
Qualifications
Required
Experience managing large-scale server environments
A balance of strengths in building systems and operationalizing them
Proficiency in Python, Go, or similar languages
Strong Linux, networking, and server hardware knowledge
Comfort digging into noisy data with SQL, PromQL, and Pandas or any other tool
Preferred
Experience with low-level details of hardware components, protocols, and associated Linux tooling (e.g., PCIe, InfiniBand, networking, power management, kernel perf tuning)
Knowledge of hardware management protocols (e.g., IPMI, Redfish)
High-performance computing (HPC) or distributed systems experience
Prior experience developing, managing, or designing hardware
Familiarity with monitoring tools (e.g., Prometheus, Grafana)
Company
OpenAI
OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT. It is a sub-organization of OpenAI Foundation.
H1B Sponsorship
OpenAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1)
2024 (1)
2023 (1)
2022 (18)
2021 (10)
2020 (6)
Funding
Current Stage
Growth Stage
Total Funding
$79B
Key Investors
The Walt Disney Company, SoftBank, Thrive Capital
2025-12-11 · Corporate Round · $1B
2025-10-02 · Secondary Market · $6.6B
2025-03-31 · Series Unknown · $40B
Company data provided by Crunchbase