Senior Software Engineer, Cloud-Native Stack – CSP Engagements jobs in United States

NVIDIA · 5 months ago

Senior Software Engineer, Cloud-Native Stack – CSP Engagements

NVIDIA is a leading technology company known for groundbreaking developments in Artificial Intelligence and High-Performance Computing. They are seeking a Senior Software Engineer for their CSP Engagements team to focus on the cloud-native stack for datacenter products, tackling complex scheduling challenges and enhancing Kubernetes and Slurm functionalities.

Computer Hardware Manufacturing
H1B Sponsor Likely

Responsibilities

Perform deep-dive debugging of multi-rack, multi-tenant clusters: scheduler behavior, container runtime issues, device-plugin crashes, RDMA/IB fabric anomalies, etc.
Gather customer requirements and prototype feature extensions for Kubernetes operators, Slurm plugins, and custom micro-services that expose new GPU capabilities
Drive joint architecture reviews and 'whiteboard' sessions with CSP and internal platform teams; convert findings into RFCs and upstream pull requests
Create reproducible testbeds (Helm/Ansible/Terraform) that mirror customer environments; automate validation and benchmark suites
Deliver technical collateral (design docs, how-to guides, demo scripts) and present at customer on-sites, KubeCon, and SlurmUG
Collaborate with AE, FAE, and Solution Architect teams to deliver integrated customer solutions and technical documentation

Qualifications

Kubernetes internals, Slurm, GPU integration, Distributed systems, CI/CD, Observability tools, Customer-facing engineering, Prototyping, Communication, Technical documentation

Required

Strong source-level expertise in Kubernetes internals (scheduler, CRI/CNI/CSI, operators) and Slurm (federation, power-save, plugins)
Hands-on experience integrating next-gen GPUs (Blackwell/GB200/GB300) or comparable accelerators into containerized clusters
Proven track record debugging large-scale, cloud-native stacks across networking (RDMA/RoCE), storage, and control planes
Customer-facing engineering or solutions-architect background: requirements gathering, PoC ownership, roadmap influence
Familiarity with CI/CD (GitHub Actions, Tekton), observability (Prometheus, OpenTelemetry), and infrastructure-as-code
Excellent communication skills, with the ability to switch between deep technical detail and high-level business impact
6+ years of professional software development experience in distributed systems (Go, Rust, C/C++ or Python for tooling)
BS or MS (or equivalent experience) in Computer Engineering, Computer Science, or related field

Preferred

Upstream contributions to Kubernetes, Slurm, Volcano, or similar projects
Experience with GPU computing (CUDA) and deep learning workloads

Benefits

Equity
Benefits

Company

NVIDIA

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing.

H1B Sponsorship

NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)

Funding

Current Stage
Late Stage
Company data provided by Crunchbase