Senior Site Reliability Engineer (SRE) Senior Manager jobs in United States

Accenture Federal Services · 3 hours ago

Senior Site Reliability Engineer (SRE) Senior Manager

Accenture Federal Services is dedicated to helping the US federal government enhance national security and improve lives through technology. The Senior Site Reliability Engineer (SRE) will design and maintain high-performance infrastructure while implementing best practices in service reliability and incident management.

Consulting, Finance, Information Technology, Management Consulting, Online Portals, Professional Services
No H1B, Security Clearance Required, U.S. Citizen Only

Responsibilities

Design, build, and maintain reliable, scalable, and high-performance infrastructure and services to support business needs
Implement and advocate for SRE best practices, including automation, CI/CD pipelines, monitoring, and incident management
Collaborate with cross-functional teams to develop systems that meet high availability, performance, and reliability standards
Drive incident management processes, including root cause analysis, mitigation strategies, and long-term preventive measures
Establish, monitor, and refine service level objectives (SLOs), service level agreements (SLAs), and key performance indicators (KPIs) to ensure systems adhere to reliability and performance targets
Automate repetitive tasks to improve operational efficiency and reduce manual intervention
Build and maintain robust monitoring, logging, and alerting systems to ensure visibility into system performance and reliability
Provide technical mentorship and guidance to team members, fostering a culture of knowledge sharing and continuous improvement
Act as a technical leader by driving solutions to complex challenges, ensuring alignment with organizational goals
Prepare and deliver performance and reliability reports to stakeholders, offering insights and recommendations for improvements

Qualifications

Site Reliability Engineering, Automation Tools, Cloud Platforms, Monitoring Tools, Incident Management, Infrastructure-as-Code, Analytical Skills, Communication Skills, Collaboration Skills, Technical Mentorship

Required

Proven experience in site reliability engineering or a similar role, with a focus on application and infrastructure scalability, reliability, and performance
Strong knowledge of ITSM principles and incident management processes
Expertise in automation tools, scripting, and infrastructure-as-code (IaC) technologies
Proficiency with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk)
Experience with cloud platforms (e.g., AWS, Azure, GCP) and container technologies (e.g., Docker, Kubernetes)
Strong analytical and problem-solving skills, with the ability to troubleshoot complex systems
Excellent communication and collaboration abilities, with a focus on cross-team partnerships
A passion for continuous learning, innovation, and driving improvements in reliability and efficiency

Preferred

Advanced Degree
15+ years of industry experience
Motivated and proactive, with a desire to understand and address complex areas
Curiosity about new technology, industry best practices, and areas of risk, with the ability to analyze new insights and turn them into concrete action
Commitment to delivering tangible outcomes for customers and stakeholders
Strong written and verbal communication/interpersonal skills to effectively collaborate with cross-functional teams and stakeholders
Excellent people management and relationship development skills
In-depth knowledge of Accenture delivery methodologies and practices

Company

Accenture Federal Services

Accenture Federal Services is a leading US federal services company and subsidiary of Accenture.

Funding

Current Stage
Late Stage

Leadership Team

Ron Ash
Chief Operating Officer
Bharat Patel
Managing Director, AI Missions
Company data provided by Crunchbase