CoreWeave · 1 day ago
Senior Hardware Engineer, GPU & PCIe
CoreWeave is The Essential Cloud for AI™, delivering a platform of technology and tools that enable innovators to build and scale AI with confidence. The Senior Hardware Engineer will focus on GPU and PCIe troubleshooting, playing a crucial role in the design, development, and optimization of server hardware infrastructure while collaborating with cross-functional teams and external vendors.
Artificial Intelligence (AI) · Cloud Computing · Cloud Infrastructure · Information Technology · Machine Learning
Responsibilities
Troubleshoot complex GPU and PCIe related failures
Partner with external vendors on failure analysis
Track component RMAs
Develop and maintain hardware/firmware management services
Automate all aspects of the server hardware lifecycle
Serve as the senior point of contact for hardware escalation and troubleshooting
Collaborate with cross-functional teams to define hardware requirements, specifications, and system architecture, as well as playbooks for issue identification and resolution
Create and maintain accurate documentation of hardware designs, specifications, test procedures, and results
Analyze and optimize the performance of hardware systems, identify bottlenecks, and propose improvements for enhanced efficiency
Establish processes for internal hardware testing, deployment, performance optimization and troubleshooting
Qualifications
Required
5+ years of experience supporting and troubleshooting data center-class GPUs (H100 or newer, including InfiniBand and NVLink)
Proficiency in Ansible/Python and experience programmatically interacting with server BMCs using IPMI or Redfish (preferably Redfish)
Experience using, integrating, and automating data center-class GPU diagnostics and troubleshooting tools, including observability platforms like Prometheus and Grafana
In-depth knowledge of server hardware, components, and management technologies, particularly GPUs and PCIe devices
Proven ability to stay updated with the latest industry technologies and trends
Previous experience collaborating with hardware vendors to identify novel issues, generate operational playbooks, create alerts and drive issue resolution to completion
Strong passion for automation, with a commitment to automating processes comprehensively
Excellent documentation skills and attention to detail
Strong analytical and problem-solving abilities
Benefits
Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to participate in the Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption
Company
CoreWeave
CoreWeave is a cloud-based AI infrastructure company offering GPU cloud services to simplify AI and machine learning workloads.
Funding
Current Stage
Public Company
Total Funding
$23.37B
Key Investors
Jane Street Capital, Stack Capital, Coatue
2025-12-08 · Post-IPO Debt · $2.54B
2025-11-12 · Post-IPO Debt · $1B
2025-08-20 · Post-IPO Secondary
Company data provided by Crunchbase