NVIDIA · 2 days ago
Senior Software Engineer, AI Inference
NVIDIA is a leader in AI computing, transforming computer graphics and accelerated computing. They are seeking a Senior System Software Engineer to work on user-facing tools for the Dynamo Inference Server, focusing on building and maintaining distributed model management systems for large-scale AI inference workloads.
AI Infrastructure, Artificial Intelligence (AI), Consumer Electronics, Foundational AI, GPU, Hardware, Software, Virtual Reality
Responsibilities
Build and maintain distributed model management systems, including Rust-based runtime components, for large-scale AI inference workloads
Implement inference scheduling and deployment solutions on Kubernetes and Slurm, while driving advances in scaling, orchestration, and resource management
Collaborate with infrastructure engineers and researchers to develop scalable APIs, services, and end-to-end inference workflows
Create monitoring, benchmarking, automation, and documentation processes to ensure low-latency, robust, and production-ready inference systems on GPU clusters
Qualifications
Required
Bachelor's, Master's, or PhD in Computer Science, ECE, or related field (or equivalent experience)
6+ years of professional software engineering experience
Strong understanding of modern ML architectures with a keen intuition for optimizing inference performance
Able to take full ownership of problems end to end, proactively acquiring any knowledge or skills needed to deliver results
Familiar with or able to quickly gain expertise in vLLM, SGLang, PyTorch, NVIDIA GPUs, and supporting software stacks such as NIXL, NCCL, CUDA, as well as HPC technologies like InfiniBand, MPI, and NVLink
Experienced in architecting, building, monitoring, and debugging production-grade distributed systems; bonus if you've worked on performance-critical ones
Preferred
Experience with inference-serving frameworks (e.g., Dynamo Inference Server, TensorRT, ONNX Runtime) and deploying/managing LLM inference pipelines at scale
Contributions to large-scale, low-latency distributed systems (open-source preferred) with proven expertise in high-availability infrastructure
Strong background in GPU inference performance tuning, CUDA-based systems, and operating across cloud-native and hybrid environments (AWS, GCP, Azure)
Benefits
Equity
Benefits
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data Powered by US Department of Labor)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage
Public Company
Total Funding
$4.09B
Key Investors
ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Recent News
Business Insider
2026-01-09
Company data provided by Crunchbase