Director, Global Network Reliability Engineering jobs in United States

NVIDIA · 3 weeks ago

Director, Global Network Reliability Engineering

NVIDIA is a leading AI company seeking a Director of Network Reliability Engineering within its Enterprise Networking organization. The role involves overseeing global network operations, ensuring reliability and efficiency, while leading a team to implement a data-driven approach to operations and continuous improvements.

AI Infrastructure, Artificial Intelligence (AI), Consumer Electronics, Foundational AI, GPU, Hardware, Software, Virtual Reality
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Your main focus will be maturing the current support model and processes into a more data-driven, automated SRE model
Build an in-house team of reliability experts for networking support and operations from the existing outsourced SMEs, providing leadership, direction, and strategy for a growing team
Set the technical vision, strategy, and roadmap for network operations in partnership with the key infrastructure and partner teams
Partner with Network Architecture and Network Engineering to establish runbooks and regular training sessions, and ensure we build the network to be self-healing
Understand RCAs from events and incidents, and work with our AI operations to enrich our observability tooling for a better full-stack view from the network to applications
Influence the architecture of the NVIDIA networks, both on-prem and in the cloud

Qualifications

Network Reliability Engineering, Data Driven Operations, Network Architecture, Team Leadership, SRE Principles, System Design, Technical Deep-Dives, Software Interface Design, Innovative Thinking, Problem-Solving Skills, Collaboration Skills

Required

Bachelor's degree in Computer Science, related technical field, or equivalent experience
Experience building and growing geographically distributed teams that appreciate local operations, bring a global perspective, and follow standards
Ability to do technical deep-dives into code, networking, operating systems, and storage, as well as being verbally and cognitively agile enough to hold your own in strategy discussions with NVIDIA's executive team and peer SMEs
Ability to identify trends and promote solutions that solve challenges efficiently across multiple product areas
Excellent innovative thinking, collaboration, and problem-solving skills
12+ years of overall experience with system design, network architecture, network engineering, and network operations, and 7+ years of leadership experience

Preferred

Experience transforming network operations using software-driven methods
Experience in a Hyperscale Cloud Service Provider (public facing or not)
Knowledge of SRE principles (observability, SLOs, SLIs, logging, etc.)
Knowledge of software interface design & documentation for less technical end-users

Benefits

Equity
Benefits

Company

NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.

H1B Sponsorship

NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)

Funding

Current Stage
Public Company
Total Funding
$4.09B
Key Investors
ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity

Leadership Team

Jensen Huang
Founder and CEO
Michael Kagan
Chief Technology Officer
Company data provided by Crunchbase