Leidos · 16 hours ago

SRE Technical Manager - Transport

Norfolk, VA

Full-time

Onsite

Director/Executive

$116K/yr - $210K/yr

8+ years exp

Leidos is an industry and technology leader serving government and commercial customers with smarter, more efficient digital and mission innovations. They are seeking a highly skilled and experienced SRE Technical Manager to lead their Transport Site Reliability Engineering team, managing a group of engineers to ensure the reliability, performance, and scalability of critical systems.

Computer, Government, Information Services, Information Technology, National Security, Software

No H1B

Security Clearance Required

U.S. Citizen Only

Responsibilities

Manage and mentor 5-6 SRE teams (pods) and 60+ FTEs, providing guidance, setting performance expectations, and fostering professional development

Work collaboratively with SRE Resource Managers to staff and maintain engineering resources for your SRE vertical teams' reliability and scalability goals

Responsible for the P&L across the Transport Services vertical. Manage the SRE team’s resources, including budget planning, tool selection, and infrastructure investments to meet reliability and scalability needs

Meet regularly with your team members and participate in performance reviews, interviews, and development planning

Oversee the reliability, availability, and performance of critical systems by leading the SRE teams within the data center vertical in implementing monitoring, incident response, and performance optimization strategies

Ensure the team adheres to best practices for system reliability, automation, and operational efficiency

Drive continuous improvement initiatives by analyzing performance metrics (e.g., SLOs, MTTR, MTBF) and identifying areas for enhancement

Collaborate with operations, quality, cybersecurity and other SRE engineering teams to define and enforce Service Level Objectives (SLOs) and manage error budgets

Act as a liaison between the SRE team and other departments to prioritize reliability and operational needs in the product development process

Collaborate with senior leadership to define the SRE strategy, set long-term reliability goals, and ensure alignment with business objectives

Lead efforts to reduce operational toil through automation. Work with the team to build or enhance automation tools that manage infrastructure, monitor systems, and respond to incidents

Oversee the development and adoption of Infrastructure as Code (IaC) tools, CI/CD pipelines, and other automation processes

Ensure that SRE practices align with organizational security policies and compliance requirements

Collaborate with security teams to integrate reliability-focused security practices into the design and operation of systems

Ensure systems meet or exceed agreed-upon service levels by proactively addressing potential issues and working with stakeholders to align on reliability expectations

Work within an SRE team, collaborating with other Developers, Security, and Operations, to continuously deliver products and increase the value stream for the organization and customers

Embrace and champion Agile development processes and the adoption of modern Site Reliability Engineering workflows and practices, while providing technical guidance on best practices to team members and coworkers

Stay up to date on the latest Site Reliability Engineering practices and technologies

Strive to provide internal and external customers with excellent, world-class service

Resolve most conflicts between timeline, budget, and scope independently, but use judgment to raise complex or consequential issues to senior management

Qualifications

SRE principles, Infrastructure as Code, Cloud infrastructure AWS, Cloud infrastructure Azure, Agile/DevOps processes, DoD 8570.01 IAT Level II, Automation tools, CI/CD pipelines, Incident management, Team management, Communication skills, Collaboration, Problem-solving

Required

Requires a B.S. degree (or equivalent) in Cybersecurity, Information Security, IT, Network Engineering, Computer Science, or a related field, or a Master's degree with 6+ years of prior relevant experience, along with 8-10 years of SRE or DevOps experience and at least 4 years in a leadership or management capacity

US Citizen with DoD Secret Clearance

Minimum of DoD 8570.01 IAT Level II Certification required prior to onboarding and must maintain certification while supporting the SMIT Contract

Must be able to support program execution in classified environments and access SIPRNet from an NMCI location on short notice (local travel)

Exceptional written and oral communication skills, including producing technical analysis/reports, presentations, and executive-level briefings with internal and external stakeholders

Ability to review and comprehend customer requirements and to design capabilities that satisfy them

Ability to work in a highly collaborative, forward-thinking, and innovation-driven environment

Proven experience managing teams responsible for large-scale, distributed systems with high reliability and performance demands

Strong track record of managing incidents, conducting postmortems, and implementing reliability improvements

Experience implementing and managing Agile or DevOps processes, with a focus on continuous improvement, efficiency, and team productivity

Ability to lead teams through strategic initiatives such as reliability maturity assessments, process automation, and tooling selection

Solid understanding of SRE principles, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgeting

Experience with commercial cloud infrastructure deployment environments such as AWS and Azure

Strong knowledge of automation tools, CI/CD pipelines, and Infrastructure as Code (IaC)

Experience with Agile and DevSecOps/SRE concepts and best practices

Hands-on experience with Atlassian products (Jira, Confluence, Bitbucket, etc.)

Experience creating Jira and/or Azure DevOps workflows, projects, and custom configurations

Solid experience integrating and maintaining various third-party CI/CD tools such as Jenkins and GitLab

Experience with automated provisioning and configuration tools such as Terraform, CloudFormation, Ansible, or similar technologies

Working knowledge of the Risk Management Framework (RMF) and DISA STIGs

Preferred

Previous work experience providing support to the NGEN-NMCI program is highly desired

Experience with microservices architecture and distributed systems

Familiarity with serverless and event-driven architectures

Certification in cloud platforms (e.g., Azure Certified DevOps Engineer)

Experience in high-growth environments or managing teams during significant scaling periods

ITILv4 and Agile SAFe certifications or applicable experience

Benefits

Health and Wellness programs

Income Protection

Paid Leave

Retirement

Company

Leidos

Glassdoor 3.9

Leidos is a Fortune 500® innovation company rapidly addressing the world’s most vexing challenges in national security and health.

Founded in 1969

Reston, Virginia, USA

10001+ employees

https://www.leidos.com/

Funding

Current Stage

Public Company

Total Funding

unknown

2025-02-20 Post-IPO Debt

2013-09-17 IPO

Leadership Team

James Carlini

Chief Technology Officer

Theodore Tanner

Chief Technology Officer

Recent News

Fortune

Fortune 500 Power Moves: Which executives gained and lost power this week

2025-12-20

MarketScreener

Leidos Names Theodore Tanner Chief Technology Officer

2025-12-16

Benzinga.com

Citigroup, Leidos Holdings And More On CNBC's 'Final Trades'

2025-12-16

Company data provided by Crunchbase