107 applicants

GEICO · 2 days ago

Senior Manager, Site Reliability Engineering (SRE) for Hybrid Cloud - Infrastructure as a Service (IaaS)

Chevy Chase, MD

Full-time

Hybrid

Director

$115K/yr - $262K/yr

Auto Insurance · Financial Services

Actively Hiring

Responsibilities

Provide strategic direction and technical leadership in the design, development, and deployment of a robust, reliable, and scalable digital infrastructure.

Drive the architecture, design, and optimization of highly available, scalable, and fault-tolerant systems and services supporting the digital engineering team.

Build and nurture a high-performing SRE team, providing mentorship, coaching, and guidance to foster a culture of continuous learning and improvement.

Work closely with all GEICO Tech product and platform teams to manage, innovate, and create new programs, software, and analytics that improve the availability, scalability, latency, and effectiveness of GEICO products and services.

Collaborate with cross-functional leaders, including product area leads, to guide product engineering in building reliable and durable production systems, and contribute to the strategic direction of the Tech organization.

Present a reliability vision and strategic recommendations clearly and concisely to stakeholders with varying degrees of SRE fluency.

Develop and own relationships with technology and business partners.

Foster effective collaboration and communication across cross-functional teams to align priorities, share best practices, and ensure smooth coordination for incident response, system maintenance, and upgrades.

Manage department budgets, resource allocation, and vendor relationships to optimize costs and maintain high-quality outcomes.

Qualifications

SRE Practices · Public Cloud · Hybrid Cloud Architecture · IaaS Technologies · Container Orchestration · Incident Management · Performance Tuning · Capacity Planning · Incident Response · Monitoring Tools · Infrastructure Automation · Configuration Management · Cloud Security · Compliance Standards · Open-Source Management · Budget Management · Vendor Collaboration · Leadership · Team Management · Mentoring · Problem-Solving · Analytical · Detail-Oriented · AWS Certified DevOps Engineer · Google Professional DevOps Engineer · Relevant cloud provider certifications

Required

Bachelor's degree in Computer Science, Information Technology, or a related field (Master's degree preferred).

Proven experience in a leadership role focused on software-defined and software-driven data center and network engineering within complex, large-scale production environments.

Deep knowledge of SRE practices, methodologies, and principles, along with a solid understanding of on-prem and public cloud-based network, compute, and storage technologies.

In-depth knowledge of hybrid cloud architecture, IaaS technologies, container orchestration platforms (e.g., Kubernetes), cloud efficiency, and observability.

Strong background in incident management, performance tuning, and capacity planning, including creating incident response playbooks, incident triaging strategies, and post-incident analysis to drive continuous improvement in system reliability and availability.

Experience with open-source management and monitoring tools (e.g., Cacti, Zabbix, Splunk, Prometheus, Grafana).

Experience with infrastructure automation, tooling, and configuration management frameworks (e.g., Puppet, Chef, Ansible, Terraform).

Familiarity with cloud security best practices and compliance standards.

Excellent leadership and team management skills with a passion for mentoring and fostering professional growth.

Strong problem-solving and analytical abilities, with a keen eye for detail and a passion for driving operational efficiency.

Experience in budget management, resource allocation, and vendor collaboration.

Preferred

Certifications such as AWS Certified DevOps Engineer, Google Professional DevOps Engineer, or other relevant cloud provider certifications.

Benefits

Premier Medical, Dental and Vision Insurance with no waiting period

Paid Vacation, Sick and Parental Leave

401(k) Plan

Tuition Reimbursement

Paid Training and Licensures

Company

GEICO

Glassdoor rating: 2.7

GEICO (Government Employees Insurance Company) has provided affordable auto insurance since 1936. It is a subsidiary of Berkshire Hathaway.

Founded in 1936