NVIDIA · 1 day ago
Senior Software Engineer, Cloud-Native Stack – CSP Engagements
NVIDIA is a leading technology company known for its advancements in Artificial Intelligence and High-Performance Computing. They are seeking a Senior Software Engineer for their CSP Engagements team to focus on the cloud-native stack for AI/ML datacenters, where the role involves defining customer workflows, prototyping stack enhancements, and debugging complex issues in multi-rack environments.
Responsibilities
Perform deep-dive debugging of multi-rack, multi-tenant clusters: scheduler behavior, container runtime issues, device-plugin crashes, RDMA/IB fabric anomalies, etc.
Gather customer requirements and prototype feature extensions for Kubernetes operators, Slurm plugins, and custom micro-services that expose new GPU capabilities
Drive joint architecture reviews and “whiteboard” sessions with CSP and internal platform teams; convert findings into RFCs and upstream pull requests
Create reproducible testbeds (Helm/Ansible/Terraform) that mirror customer environments; automate validation and benchmark suites
Deliver technical collateral (design docs, how-to guides, demo scripts) and present at customer on-sites, KubeCon, and SlurmUG
Collaborate with AE, FAE, and Solution Architect teams to deliver integrated customer solutions and technical documentation
Qualifications
Required
Strong source-level expertise in Kubernetes internals (scheduler, CRI/CNI/CSI, operators) and Slurm (federation, power-save, plugins)
Hands-on experience integrating next-gen GPUs (Blackwell/GB200/GB300) or comparable accelerators into containerized clusters
Proven track record debugging large-scale, cloud-native stacks across networking (RDMA/RoCE), storage, and control planes
Customer-facing engineering or solutions-architect background: requirements gathering, PoC ownership, roadmap influence
Familiarity with CI/CD (GitHub Actions, Tekton), observability (Prometheus, OpenTelemetry), and infrastructure-as-code
Excellent communication: able to switch between deep technical detail and high-level business impact
6+ years of professional software development experience in distributed systems (Go, Rust, C/C++ or Python for tooling)
BS or MS (or equivalent experience) in Computer Engineering, Computer Science, or related field
Preferred
Upstream contributions to Kubernetes, Slurm, Volcano, or similar projects
Experience with GPU computing (CUDA) and deep learning workloads
Benefits
Equity
Benefits
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage
Public Company
Total Funding
$4.09B
Key Investors
ARPA-E
ARK Investment Management
SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Company data provided by Crunchbase