Staff Site Reliability Engineer, Compute jobs in United States

Crusoe · 2 weeks ago

Staff Site Reliability Engineer, Compute

Crusoe is on a mission to accelerate the abundance of energy and intelligence through sustainable technology. The Staff Site Reliability Engineer will support virtualization and optimize compute infrastructure, focusing on performance, security, and scalability for AI and HPC workloads.

AI Infrastructure · Artificial Intelligence (AI) · Data Center · Energy · Energy Management · Oil and Gas
H1B Sponsor Likely

Responsibilities

Develop automation and observability tools to monitor Crusoe’s compute infrastructure, spanning from the kernel to orchestration layers
Support and scale the company’s virtualization stack, including technologies such as KVM, QEMU, and other hypervisors
Collaborate with Linux kernel and hardware teams to identify and resolve performance bottlenecks and driver issues, and to optimize hardware offloads
Participate in root cause analysis for kernel crashes, hardware-software integration problems, and performance regressions
Integrate hypervisor-level enhancements to improve guest VM reliability and workload isolation
Tune kernel subsystems such as the process scheduler, NUMA configuration, memory management, and interrupt handling
Work closely with platform teams to implement and validate support for emerging compute hardware, including SmartNICs, BlueField devices, and TPUs

Qualifications

Linux kernel internals · Virtualization technologies · Programming languages · Infrastructure as Code · System-level debugging · Compute scheduling · Collaboration · Problem-solving

Required

8+ years of professional experience in Compute SRE, Linux system engineering, or compute infrastructure roles
Strong proficiency in Linux kernel internals, with exposure to scheduler, memory allocation, and driver subsystems
Experience with virtualization architectures and technologies such as KVM, Xen, QEMU, or VMware
Familiarity with SmartNICs/DPUs (e.g., NVIDIA CX6/7, BlueField-3) and kernel bypass techniques
Expert-level skills in at least one programming language: Go, C, or Rust
Experience with system-level debugging, including kdump, kexec, and kernel panic analysis
Proficiency in Infrastructure as Code tooling and CI/CD practices for bare-metal or cloud infrastructure
Strong understanding of compute scheduling, resource management, and high-throughput networking

Benefits

Industry competitive pay
Restricted Stock Units in a fast-growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSAs
Paid Parental Leave
Paid life insurance, short-term and long-term disability
Teladoc
401(k) with a 100% match up to 4% of salary
Generous paid time off and holiday schedule
Cell phone reimbursement
Tuition reimbursement
Subscription to the Calm app
MetLife Legal
Company-paid commuter benefit: $300 per pay period

Company

Crusoe

Crusoe is a vertically integrated AI infrastructure company that builds and operates data centers powered by energy sources.

H1B Sponsorship

Crusoe has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (69)
2024 (14)
2023 (2)
2022 (1)
2021 (1)

Funding

Current Stage: Late Stage
Total Funding: $3.9B
Key Investors: Victory Park Capital, Brookfield Asset Management, Upper90
2025-12-19: Secondary Market
2025-10-23: Series E · $1.4B
2025-08-25: Debt Financing · $175M

Leadership Team

Chase Lochmiller, Co-Founder and Chief Executive Officer
Cully Cavness, Co-Founder, President and Chief Strategy Officer
Company data provided by Crunchbase