CoreWeave · 3 days ago
Staff Software Engineer
CoreWeave is The Essential Cloud for AI™, delivering a platform that enables innovators to build and scale AI with confidence. As a Staff Software Engineer, you will architect large-scale infrastructure services for GPU servers, focusing on security, reliability, and scalability, while leading technical projects and mentoring engineers.
Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Information Technology · Machine Learning
Responsibilities
Provide technical leadership in designing, architecting, and operating large-scale infrastructure services for GPU servers, with a focus on security, reliability, and scalability
Build and enhance infrastructure services and automation, including inventory management systems and lifecycle management solutions using open source technologies
Drive strategic direction for infrastructure automation, lifecycle management, and service orchestration, making MetalDev core services more scalable and resilient
Define best practices for API development (REST/gRPC), distributed databases, and Kubernetes orchestration—while mentoring engineers to follow your lead
Partner with hardware, software, and operations teams to align infrastructure with business impact
Contribute to open source communities (e.g., Go, Redfish) through collaboration and technical thought leadership
Lead and improve CI/CD pipelines for hardware compliance, firmware management, and data systems
Champion reliability and operational excellence by driving observability (Prometheus/Grafana), production incident response, and continuous service improvement
Qualifications
Required
B.S., M.S., or PhD in Computer Science or related field, or equivalent experience
8+ years of software engineering experience with a strong focus on infrastructure, cloud engineering, and distributed databases—particularly within large-scale datacenter and cloud environments
Expertise in Go and proven experience building REST/gRPC APIs for mission-critical platforms
Strong background in architecting and scaling cloud-native Kubernetes infrastructure and distributed services
Proven success in mentoring engineers, leading technical projects, and influencing engineering strategy across teams
Experience contributing to and collaborating with open source communities
Skilled in applying a data-driven approach to reliability, optimization, and continuous improvement
Excellent communicator able to work effectively with both technical and non-technical stakeholders
Hands-on experience with observability stacks (Prometheus, Grafana, PromQL), CI/CD pipelines, and operating large fleets of GPU servers
Track record of leading incident response, postmortems, and driving robust service reliability
Preferred
Working knowledge of Kafka, ClickHouse, and CockroachDB (CRDB)
Familiarity with DMTF standards, Redfish APIs, and GPU servers
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
CoreWeave
CoreWeave is a cloud-based AI infrastructure company offering GPU cloud services to simplify AI and machine learning workloads.
Funding
Current Stage: Public Company
Total Funding: $23.37B
Key Investors: Jane Street Capital, Stack Capital, Coatue
2025-12-08 · Post-IPO Debt · $2.54B
2025-11-12 · Post-IPO Debt · $1B
2025-08-20 · Post-IPO Secondary
Company data provided by Crunchbase