CoreWeave · 11 hours ago
Infrastructure Operations Program Manager
CoreWeave is The Essential Cloud for AI™, focused on providing infrastructure for AI workloads. The Infrastructure Operations Program Manager will operationalize and scale CoreWeave’s bare metal support and RMA programs, ensuring effective communication and alignment between clients, internal teams, and external vendors while improving operational processes and tools.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Information Technology · Machine Learning
Responsibilities
Own and drive RMA workflows: coordinate with internal teams and vendors to streamline how hardware failures are diagnosed, escalated, replaced, and tracked, and use data to monitor trends, drive accountability, and reduce repeat issues
Lead cross-functional maintenance efforts, helping ensure that both proactive and reactive maintenance (upgrades, replacements, power/cooling work, etc.) are planned, communicated, and executed with minimal disruption to customers — and that post-maintenance reviews are informed by meaningful KPIs
Act as a bridge between Support, FleetOps, and the client by representing the customer’s voice internally and ensuring that field work, hardware tracking, and incident handling are executed cleanly and well understood
Build repeatable, measurable processes by turning ad hoc or one-off actions into standardized, trackable workflows that can scale across regions, vendors, and clients, with dashboards or reporting in place to track backlog, resolution time, vendor performance, and program health
Contribute to the overall maturity of the program by helping the team transition to a metrics-driven, well-documented operational model that enables faster time-to-resolution, clearer accountability, and an improved customer experience
Qualifications
Required
3+ years of technical program management experience in the cloud or high-performance computing space as it relates to server and data center operations
Proven technical project management experience in cloud or high-performance computing, specifically server and data center operations
Demonstrated ability to assimilate complex technical concepts and apply them to improve operational processes
Proven problem-solving skills and adaptability in a fast-paced environment
Excellent communication and collaboration abilities, with experience working effectively across multidisciplinary internal and external teams
Familiarity with Linux, containerization technologies, and cloud computing concepts
Demonstrated hardware troubleshooting knowledge within an infrastructure space
Expertise in working with supply chain and engineering teams to deliver value to customers, driving improvements to operations and tooling
Proven experience with implementing process workflows, ticketing systems, data analysis, and tooling/methodology for creating and maintaining detailed documentation for customer-facing reports
Demonstrated leadership ability and excellent communication skills
Applicants must have work authorization that does not require sponsorship from the company now or in the future
Preferred
You're curious about Kubernetes, Docker, and containerized infrastructure
You have strong problem-solving skills with a proactive and analytical mindset
You have excellent communication skills and a demonstrated ability to work collaboratively in a fast-paced environment
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
CoreWeave
CoreWeave is a cloud-based AI infrastructure company offering GPU cloud services to simplify AI and machine learning workloads.
Funding
Current Stage: Public Company
Total Funding: $23.37B
Key Investors: Jane Street Capital, Stack Capital, Coatue
2025-12-08 · Post-IPO Debt · $2.54B
2025-11-12 · Post-IPO Debt · $1B
2025-08-20 · Post-IPO Secondary
Company data provided by Crunchbase