Software Engineer, Infrastructure Reliability jobs in United States

OpenAI · 1 day ago

Software Engineer, Infrastructure Reliability

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. They are seeking Software Engineers to join their Applied Infrastructure organization, focusing on scaling and hardening the infrastructure that powers widely used AI systems. The role involves designing reliable systems, identifying performance bottlenecks, and improving automation to enhance internal tooling and developer experience.

Agentic AI, Artificial Intelligence (AI), Foundational AI, Generative AI, Machine Learning, Natural Language Processing, SaaS
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Design, build, and operate reliable and performant systems used across engineering
Identify and fix performance bottlenecks and inefficiencies, ensuring our infrastructure can scale to the next order of magnitude
Dig deep to resolve complex issues
Continuously improve automation to reduce manual work
Improve internal tooling and our developer experience
Contribute to incident response, postmortems, and the development of best practices around system reliability and scalability

Qualifications

Distributed systems, Cloud infrastructure, Kubernetes, Observability tools, Microservices architecture, Security best practices, Problem-solving, Collaboration, Adaptability

Required

4+ years of relevant industry experience, with 2+ years leading large-scale, complex projects or teams as an engineer or tech lead
A passion for distributed systems at scale with a focus on reliability, scalability, security, and continuous improvement
Proven experience as a reliability engineer, production engineer, or a similar role in a fast-paced, rapidly scaling company
Strong proficiency in cloud infrastructure (such as AWS, GCP, or Azure) and IaC tools such as Terraform. Proficiency in programming and scripting languages
Experience with containerization technologies and container orchestration platforms like Kubernetes
Experience with observability tools such as Datadog, Prometheus, Grafana, Splunk, and the ELK stack
Experience with microservices architecture and service mesh technologies
Knowledge of security best practices in cloud environments
Strong understanding of distributed systems, networking, and database technologies
Excellent problem-solving skills and ability to work in a fast-paced environment

Company

OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT. It is a sub-organization of OpenAI Foundation.

H1B Sponsorship

OpenAI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (1)
2024 (1)
2023 (1)
2022 (18)
2021 (10)
2020 (6)

Funding

Current Stage: Growth Stage
Total Funding: $79B
Key Investors: The Walt Disney Company, SoftBank, Thrive Capital
2025-12-11 · Corporate Round · $1B
2025-10-02 · Secondary Market · $6.6B
2025-03-31 · Series Unknown · $40B

Leadership Team

Fidji Simo
CEO, Applications
Sam Altman
CEO & Co-Founder
Company data provided by Crunchbase