NVIDIA · 2 weeks ago
Senior Manager, Cloud Operations Engineering
NVIDIA is a leader in computer graphics and accelerated computing, and they are seeking a highly skilled Senior Manager for their Cloud Operations Engineering team. In this role, you will drive the efficiency, reliability, and scalability of systems that support global business operations, while mentoring a team of engineers and implementing innovative automation solutions.
AI Infrastructure · Artificial Intelligence (AI) · Consumer Electronics · Foundational AI · GPU · Hardware · Software · Virtual Reality
Responsibilities
Lead, mentor, and develop a team of 4-8 engineers, providing technical guidance, performance feedback, and career development opportunities
Build and implement comprehensive monitoring, alerting, and reporting solutions using industry-standard tools
Develop and maintain automation pipelines to streamline operational workflows and reduce manual overhead
Coordinate incident, problem, and process adjustment procedures in alignment with ITSM guidelines
Collaborate with cross-functional teams to identify operational challenges and implement solutions
Build and maintain internal operational tools and frameworks that enhance team productivity
Ensure alignment with security and compliance standards across all operational systems and processes
Define key performance indicators and metrics to measure operational health and team performance
Qualifications
Required
BS/MS in Computer Science or a related technical field, or equivalent experience, combined with 8+ overall years of hands-on experience building, supporting, and managing complex services and infrastructure
Proven track record of 4+ years of leadership/management experience in a technical environment
Strong proficiency in Python for automation, data handling, and tool development
Hands-on experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, CloudWatch, or Splunk
Demonstrated expertise in ITSM practices, including incident, problem, and process improvement
Ability to implement secure and compliant offboarding procedures and manage access-related tasks
Strong understanding of IT operations, system workflows, and operational standards
Core knowledge of Java, including Collections API, Streams API, Concurrency, and I/O
Solid understanding of RDBMS and NoSQL databases, with hands-on experience in Cassandra, DynamoDB, or Redis
Preferred
Experience designing or implementing end-to-end automation pipelines and internal operational tools
Prior experience in security-conscious or compliance-heavy environments (financial services, healthcare, SaaS, etc.)
Expertise in creating comprehensive monitoring solutions, custom dashboards, and automated reporting mechanisms
Track record of success in fast-paced, high-growth environments with constantly evolving operational needs
Strong documentation habits and demonstrated commitment to continuous improvement and knowledge management
Benefits
Equity
Benefits
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage
Public Company
Total Funding
$4.09B
Key Investors
ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Recent News
Business Insider
2026-01-09
Company data provided by Crunchbase