Operations Manager jobs in United States

Undisclosed · 1 day ago

Operations Manager

The client, a global hyperscaler cloud provider, is seeking a Senior Operations Manager to lead and scale mission-critical cloud operations supporting large enterprise and strategic customers. This role is responsible for end-to-end operational excellence across cloud infrastructure, service reliability, incident management, and cross-functional execution.

Financial Services

Responsibilities

Own day-to-day operations for a portfolio of hyperscaler cloud services or regions supporting enterprise and strategic customers
Ensure availability, reliability, scalability, and performance SLAs are consistently met or exceeded
Act as the operational owner for services spanning compute, storage, networking, and managed cloud platforms
Lead major incident management (SEV-1 / SEV-2), serving as the operational commander during high-impact outages
Coordinate cross-functional response across SRE, Engineering, Networking, Security, and Support teams
Drive root cause analysis (RCA), corrective actions, and long-term preventative initiatives
Establish and enforce incident response playbooks, escalation paths, and on-call readiness
Define, implement, and continuously improve operational processes aligned with hyperscaler best practices
Drive automation-first operations to reduce toil and shorten mean time to detect (MTTD) and mean time to resolve (MTTR)
Establish standardized operating procedures for: Change management, Capacity planning, Service launches, Maintenance events, Disaster recovery and failover
Partner with Capacity Planning and Infrastructure teams to forecast demand and ensure global-scale readiness
Oversee operational execution of region expansions, service rollouts, and infrastructure scaling events
Ensure operational preparedness for peak traffic, large customer migrations, and global launches
Ensure operational compliance with security, privacy, and regulatory standards (e.g., SOC 2, ISO, PCI, HIPAA)
Partner with Security teams to operationalize: Incident response for security events, Access controls and change audits, Risk mitigation and compliance reporting
Own operational readiness for audits and compliance reviews
Serve as the operational bridge between: Engineering & SRE teams, Product & Service Owners, Customer Support & Account Teams, Executive leadership
Translate operational risks, trends, and metrics into clear executive-level insights and action plans
Influence roadmap prioritization through data-driven operational feedback
Define and track key operational metrics, including: Availability and reliability KPIs, Incident trends, Capacity utilization, Operational efficiency and automation coverage
Deliver regular operational reviews and executive readouts highlighting risks, improvements, and outcomes
Maintain detailed documentation of operational dependencies, runbooks, and service ownership

Qualifications

Cloud Operations, Hyperscale Cloud Platforms, Distributed Systems, Incident Management, Process Improvement, Capacity Planning, Security Compliance, Executive Communication, Process-Driven, Outcomes-Focused, Cloud Certifications, Leadership Skills, Cross-Functional Collaboration, Detail-Oriented

Required

7+ years of experience in Cloud Operations, SRE Operations, Infrastructure Operations, or Technical Operations Management
Proven experience operating hyperscale, distributed cloud platforms in a 24/7 environment
Strong background managing large-scale production systems with high availability and strict SLAs
Deep experience with hyperscaler cloud platforms: AWS, Azure, and/or Google Cloud Platform
Strong understanding of distributed systems and microservices architectures, global networking (DNS, load balancing, traffic routing), and compute, storage, and container orchestration (VMs, Kubernetes)
Familiarity with monitoring and observability systems, incident management and alerting frameworks, CI/CD pipelines, and infrastructure-as-code
Working knowledge of SRE principles including error budgets, reliability targets, and operational maturity models
Proven ability to lead under pressure during high-severity incidents
Strong executive communication skills with the ability to present operational data clearly and confidently
Experience managing and influencing cross-functional teams without direct authority
Detail-oriented, process-driven, and outcomes-focused

Preferred

Experience working directly within a hyperscaler environment (AWS, Azure, GCP, or equivalent)
Background in SRE, Infrastructure Engineering, or Cloud Platform Engineering
Experience supporting mission-critical, global-scale services
Cloud certifications (AWS, Azure, or GCP)
Experience operating in security-sensitive or regulated environments

Company

Undisclosed

Financial Services

Funding

Current Stage
Late Stage

Leadership Team

Sean Blanton, Ph.D.
Co-Founder and Member of the Management Board
Jeff Saginor
Chief Technology Officer
Company data provided by crunchbase