NVIDIA · 4 days ago
Director, Technical Program Management - AI and ML Platforms
NVIDIA is a leading technology company seeking a Director of Technical Program Management to lead AI/ML Platform initiatives within the DGX Cloud Infrastructure team. This role focuses on coordinating large cross-functional programs that improve the development and deployment of AI models, ensuring seamless integration of hardware and orchestration for optimal performance.
AI Infrastructure · Artificial Intelligence (AI) · Consumer Electronics · Foundational AI · GPU · Hardware · Software · Virtual Reality
Responsibilities
Lead and scale the Technical Program Management organization responsible for the DGX Cloud AI/ML platform, enabling more than 1,000 NVIDIA researchers globally
Drive the roadmap for end-to-end AI/ML infrastructure, spanning cluster bring-up, workload orchestration, GPU resource management, and integration with MLOps pipelines
Collaborate with leaders in technology and innovation to define platform needs, align the compute strategy with AI model advancement, and provide a seamless researcher experience
Lead complex programs involving next-generation systems (e.g., GB200) and fleet-wide scaling initiatives across OCI, GCP, and other hyperscalers
Own platform efficiency and capacity management, using deep understanding of scheduling systems (e.g., Slurm, hybrid models) to optimize job placement, utilization, and turnaround time
Establish data-driven operational metrics (availability, occupancy, wait times, throughput) and use them to guide continuous improvement and prioritization
Implement governance and visibility frameworks that drive alignment, predictability, and accountability across AI platform initiatives
Represent DGX Cloud programs to senior leadership, clearly articulating impact, risk, and value across engineering and research organizations
Qualifications
Required
15+ years of overall technical program management experience, including 7+ years leading and developing TPM teams in infrastructure, AI/ML, or platform engineering domains
Demonstrated success in implementing AI and machine learning systems and platform initiatives at large scale, encompassing workload coordination, data pipeline integration, model training environments, and GPU fleet management
Deep technical understanding of AI/ML workflows, job scheduling (Slurm, Kubernetes, hybrid orchestration), and large-scale distributed systems
Proficiency in optimizing resource usage and monitoring performance metrics in compute-heavy settings
Experience building platforms across cloud and on-prem hybrid architectures, integrating with internal and external MLOps stacks
Proficiency with observability and telemetry tools (e.g., Grafana, Prometheus) for infrastructure monitoring and performance analysis
Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
Preferred
Proficiency in AI/ML systems, model lifecycle management, and developer tools for large-scale training tasks
Track record driving R&D productivity platforms and reducing friction for machine learning practitioners
Experience in new product introduction (NPI) for research and infrastructure systems
Deep familiarity with cloud compute and orchestration technologies, and a passion for automation and operational excellence
Executive communication skills, able to translate complex technical programs into clear business and research outcomes
Benefits
Equity
Benefits
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship. The highlighted segment represents job fields similar to this job.]
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Company data provided by crunchbase