Leadership Triangle · 6 hours ago
Director, Site Reliability Engineering
Fidelity is a privately held company focused on making financial expertise broadly accessible. The Director of Site Reliability Engineering will manage and lead a team of SREs and Production Support Engineers, ensuring the reliability and availability of Fidelity’s systems through automation and best practices in resiliency engineering.
Consulting · Non Profit · Training
Responsibilities
Help define and execute a comprehensive reliability and observability strategy, ensuring that Fidelity’s systems are always available when our customers need them
Bring together technical, procedural, and financial data to reduce toil and increase efficiency
Execute plans for technical standardization and process refinement within the engineering organization, especially for Site Reliability Engineers and Production Support Engineers
Troubleshoot stack-wide engineering issues related to hardware, software, network, applications, and cloud service providers
Qualifications
Required
Bachelor's degree or higher in a technology-related field (e.g., Engineering, Computer Science)
10+ years of hands-on experience deploying and/or supporting highly distributed multi-tiered systems at scale
5+ years of experience with AWS
3+ years of experience with Kubernetes container orchestration (EKS)
Experience operating and implementing distributed & highly concurrent service-based architectures, including microservices, containerized services, and/or serverless architectures
Thought leadership and an ability to plan and drive complex initiatives using agile principles
Ability to triage, execute root cause analysis, and be decisive under pressure
Strong understanding across cloud infrastructure components (server, storage, network, data, and applications) to deliver end-to-end Cloud Infrastructure architectures and designs
Solid understanding of Cloud Computing and DevOps concepts including CI/CD pipelines
Proven experience in implementing advanced observability practices and techniques at scale
Demonstrated ability to utilize modern monitoring tools (Datadog, Prometheus, Splunk)
Proficient communication skills, with the ability to reach both technical and non-technical audiences
Ability to work constructively and collaboratively with a variety of individuals and groups, both in person and virtually, and to build and maintain effective relationships
Preferred
Master's degree
AWS Certifications
Ability to automate with various scripting languages (Python, Shell scripting, etc.)
Experience managing systems, including IAM, using infrastructure-as-code tools (e.g., Terraform)
Benefits
401(k) with company match
Medical, dental, vision and prescription drug coverage
16-week maternity leave & 12-week parental leave
Student loan assistance
Company
Leadership Triangle
Leadership Triangle educates and promotes regionalism across the separate communities.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase