OpenAI · 2 hours ago

Software Engineer, Platform Systems

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. As a Software Engineer on the Platform Systems team, you will design and build distributed systems that enhance the visibility and reliability of large-scale AI training workloads.

Agentic AI · Artificial Intelligence (AI) · Foundational AI · Generative AI · Machine Learning · Natural Language Processing · SaaS
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Design and build distributed failure detection, tracing, and profiling systems for large-scale AI training jobs
Develop tooling to identify slow, faulty, or misbehaving nodes and provide actionable visibility into system behavior
Improve observability, reliability, and performance across OpenAI’s training platform
Debug and resolve issues in complex, high-throughput distributed systems
Collaborate with systems, infrastructure, and research teams to evolve platform capabilities
Extend and adapt failure detection systems or tracing systems to support new training paradigms and workloads

Qualifications

Distributed systems · Failure detection · Performance analysis · High-performance computing · Low-level systems engineering · Automation · Collaboration · Problem-solving

Required

Care deeply about performance, stability, and observability in distributed systems
Enjoy finding and fixing issues in large-scale systems and automating operational workflows
Have experience writing low-level software where system details matter
Understand hardware, operating systems, networking, concurrency, and distributed systems
Have a background in high-performance computing or low-level systems engineering
Are excited to work on critical infrastructure that powers frontier AI research

Company

OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT. It is a sub-organization of the OpenAI Foundation.

H1B Sponsorship

OpenAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1)
2024 (1)
2023 (1)
2022 (18)
2021 (10)
2020 (6)

Funding

Current Stage
Growth Stage
Total Funding
$79B
Key Investors
The Walt Disney Company · SoftBank · Thrive Capital
2025-12-11 · Corporate Round · $1B
2025-10-02 · Secondary Market · $6.6B
2025-03-31 · Series Unknown · $40B

Leadership Team

Fidji Simo
CEO, Applications
Sam Altman
CEO & Co-Founder
Company data provided by Crunchbase