NVIDIA · 13 hours ago
Software Architect, NIM Factory
NVIDIA is the platform upon which every new AI-powered application is built. The company is seeking a Software Architect to define and own the technical vision for the NVIDIA Inference Microservices (NIM) Factory, guiding the architectural direction for enterprise-grade AI services while ensuring reliability, performance, and security across thousands of GPUs.
Artificial Intelligence (AI) · Consumer Electronics · GPU · Hardware · Software · Virtual Reality
Responsibilities
Define the end-to-end technical architecture for the NIM Factory, from container build systems and CI/CD to Kubernetes deployment patterns and runtime optimization
Drive technical strategy and roadmap, making high-impact decisions on frameworks, technologies, and standards that empower dozens of engineering teams
Architect and influence the design of workflow orchestration systems that underpin the NIM Factory
Guide and support senior engineers throughout the organization in building a culture centered on technical excellence and innovation
Advocate for guidelines in software development, encompassing API composition, automation, observability, and secure supply chain management
Collaborate with leadership across research, backend, SRE, and product to align technical vision with product goals and influence technical roadmaps
Qualifications
Required
15+ years of experience building large-scale, production distributed systems
Consistent track record in a technical leadership or architect role, setting and implementing technical direction
Deep architectural expertise in cloud-native technologies, including Kubernetes, containers, and microservices
Exceptional ability to mentor and grow senior engineers, with a passion for raising the technical bar of the entire organization
Proficiency in languages like Python for building tooling and services
Experience architecting solutions for GPU-accelerated or other high-performance computing workloads
Excellent communication and collaboration skills, with the ability to articulate complex technical concepts to diverse audiences and drive consensus
A degree in Computer Science, Computer Engineering, or a related field (BS or MS), or equivalent experience
Preferred
Hands-on experience with LLM inference stacks (Triton Inference Server, TensorRT-LLM, vLLM)
Experience optimizing large-model serving (KV cache sharding/paging, tensor/sequence parallelism, speculative decoding, dynamic batching)
Experience architecting next-generation container build systems or CI/CD platforms at scale
Background with workflow orchestration engines (e.g., Temporal, Airflow) for complex, distributed processes
Expertise in designing multi-tenant, multi-cluster, or edge/air-gapped deployment architectures
Benefits
Equity
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (1418)
2024 (1356)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage
Public Company
Total Funding
$4.09B
Key Investors
ARPA-E · ARK Investment Management · SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Company data provided by Crunchbase