CoreWeave · 2 weeks ago
Senior Software Engineer, Server Fleet Infrastructure
CoreWeave is The Essential Cloud for AI™, delivering a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. The Senior Software Engineer will design and build software that manages complex infrastructure across globally distributed datacenters, focusing on high-performance computing systems that power large AI workloads.
AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Cloud Infrastructure, Information Technology, Machine Learning
Responsibilities
Design and implement solutions to problems of scale for multi-site deployment and management of CoreWeave’s global server hardware fleet
Build and maintain backend services and APIs (gRPC/REST) in Go or Python to interact with Kubernetes and other infrastructure systems
Develop provisioning services, automation workflows, and fleet management tools that span from bare metal to container orchestration
Write and maintain Kubernetes custom controllers and operators to automate infrastructure behavior
Design and implement observability solutions for large-scale server monitoring to improve system stability and insight
Adapt and extend open source tooling to enhance visibility into system metrics, performance, and health
Create test plans, deployment automation, dashboards, alerts, and insights into our fleet operations
Resolve integration challenges across the entire infrastructure stack, from data center hardware to orchestration platforms
Participate in an on-call rotation
Qualifications
Required
5+ years of experience in software or infrastructure engineering
Proficiency in Go and/or Python software development
Familiarity with CI/CD tools like Argo, Flux, and GitHub Actions
Strong understanding of Linux internals
Preferred
Experience designing, implementing, and monitoring Kubernetes operators for custom resource definitions
Experience with infrastructure automation and configuration management tools such as Ansible, Puppet, Chef, and Salt
Experience with distributed cloud computing principles, including testing strategies, observability, error budgets, and fault-tolerant design
Experience implementing metrics pipelines, custom alerts, and monitoring strategies
Ability to break down complex problems into achievable tasks and collaborate with teammates to execute them
Willingness and ability to thrive in a fast-paced startup environment
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short- and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to participate in the Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
CoreWeave
CoreWeave is a cloud-based AI infrastructure company offering GPU cloud services to simplify AI and machine learning workloads.
Funding
Current Stage
Public Company
Total Funding
$23.37B
Key Investors
Jane Street Capital, Stack Capital, Coatue
2025-12-08 · Post-IPO Debt · $2.54B
2025-11-12 · Post-IPO Debt · $1B
2025-08-20 · Post-IPO Secondary
Recent News
The Motley Fool
2026-01-09
Company data provided by Crunchbase