Director, Enterprise Incident Response & Monitoring jobs in United States

Manheim Dallas-Fort Worth · 1 day ago

Director, Enterprise Incident Response & Monitoring

Manheim Dallas-Fort Worth, part of Cox Automotive, is seeking a Director of Incident Response & Enterprise Monitoring. The role leads the Incident Response and Enterprise Monitoring teams, modernizes incident response practices, and ensures operational resilience across the organization.

Automotive
No H1B

Responsibilities

Lead the Incident Response and Enterprise Monitoring teams within the Enterprise Operations organization
Define and execute a modernization strategy for enterprise monitoring, transforming the practice from reactive alerting to proactive, insight-driven observability
Partner with the Engineering Platform and Engineering Teams to embed observability, automation, and governance directly into CI/CD pipelines and service delivery processes
Establish enterprise-wide standards for incident management, escalation, communication, and governance across all CAPTG (Cox Automotive Product & Technology Group) teams
Represent Enterprise Operations in executive-level forums, articulating readiness posture, incident trends, and monitoring health
Serve as the executive incident commander as needed during major or business-critical outages, coordinating rapid technical recovery and engaging directly with senior leadership
Drive consistent use of the Incident Resolution Framework, ensuring data-driven root cause analysis and long-term prevention
Lead continual refinement of incident playbooks, automation, and communication protocols to accelerate mean time to resolve (MTTR)
Collaborate with Security, Platform Engineering, and Engineering Teams to ensure unified response and governance across CAPTG
Lead the modernization of Cox Automotive’s enterprise monitoring practice, building an integrated observability ecosystem that spans infrastructure, applications, and digital experiences
Partner with business, product, and engineering teams to define SLOs, SLIs, and error budgets that tie operational health to client outcomes and business value
Champion predictive analytics and automation to proactively identify and mitigate emerging risks before they impact customers
Lead, mentor, and grow a team of Incident Response Engineers, Observability Engineers, and Analysts
Build a culture of ownership, speed, and precision in both incident response and monitoring disciplines
Foster close collaboration with Platform, Reliability, and Security Engineering Teams to embed reliability as a shared responsibility
Reinforce NextGen Ops principles—empowering engineers, simplifying operations, and elevating reliability standards across Cox Automotive

Qualifications

Incident Response Leadership
Enterprise Monitoring Modernization
Observability Platforms
Cloud Platforms
AIOps
Data-Driven Intelligence
Communication Skills
Team Leadership
Collaboration

Required

10+ years of experience in IT Operations, Site Reliability Engineering, or Platform Engineering; 5+ years leading enterprise-scale incident response or monitoring functions
Proven success leading and personally managing major incidents in distributed or hybrid cloud environments
Deep expertise in modern observability and monitoring platforms (ServiceNow, Splunk, New Relic, etc.)
Strong understanding of event correlation, AIOps, and data-driven operational intelligence
Technical fluency across cloud platforms (AWS, GCP, Azure), infrastructure, and CI/CD ecosystems
Exceptional communication and composure under pressure; able to lead at both the executive and engineering levels
Bachelor's degree in Computer Science, Engineering, or related field (Master's preferred)

Preferred

Demonstrated experience modernizing enterprise monitoring or observability programs in large-scale environments
Experience implementing AI/ML-based monitoring or predictive analytics in operational contexts
Passion for building resilient systems, empowering engineering teams, and advancing client trust through operational excellence

Benefits

Flexibility for employees to take as much paid vacation as they deem consistent with their duties, the company’s needs, and its obligations
Seven paid holidays throughout the calendar year
Up to 160 hours of paid wellness time annually for employees' own wellness or that of family members
Additional paid time off in the form of bereavement leave
Time off to vote
Jury duty leave
Volunteer time off
Military leave
Parental leave

Company

Manheim Dallas-Fort Worth

Manheim continues to set the industry standard for buying and selling used vehicles today.

Funding

Current Stage: Growth Stage
Company data provided by Crunchbase