FedNow Lead Site Reliability Engineer jobs in United States
This job has closed.

Jobs via Dice · 6 hours ago

FedNow Lead Site Reliability Engineer

Federal Reserve Financial Services (FRFS) is transforming into a national, enterprise-focused organization. It is seeking a Lead Site Reliability Engineer to operate the production environment for the FedNow program, architect solutions for monitoring and tooling, and ensure seamless automation across the platform.

Computer Software
No H1B · U.S. Citizen Only

Responsibilities

As a Lead Engineer of the SRE / Production Operations team for FedNow, you will operate the production environment for the program
You will architect, implement, and leverage solution monitoring and tooling to be used for capacity planning, utilization reporting, and scaling
The team uses open source and proprietary software to support Engineering, DevOps, and DevSecOps tools, services, and solutions
CI/CD and IaC Pipeline automation design and development
Resiliency, DR and BCP (including testing)
The SRE / Production Operations team is part of the Technical Operations (TechOps) department and has the overall responsibility for the design, management and execution of operations required to support the ongoing technical and delivery needs of the FedNow Program, as well as the transition to production support and operations
This team interfaces with internal stakeholders and customers for planning, delivery, and service management
It owns ongoing ITIL processes and drives the implementation of continuous improvement initiatives
You will work closely with Engineers and Architects of the FedNow program in order to maintain seamless automation across the entire platform
Proactively identify suspected gaps in system architecture and design experiments to expose them
The ideal candidate is someone who loves building and maintaining reliable, scalable systems and CI/CD tooling, and automating cloud-based, highly available, high-performing applications

Qualifications

AWS environments, CI/CD tooling, Infrastructure automation, Micro-services architecture, Python scripting, Docker, Containers, Observability tools, Fault Injection tooling, Communication, Collaboration skills

Required

Strong communication and collaboration skills
Extensive knowledge and understanding of working in AWS environments and services (EC2, EBS, EKS, RDS, Aurora, S3, Route 53, ELB, IAM, etc.)
HashiCorp Terraform, Consul, and Vault, as well as Ansible
Automation experience, preferably using GitLab
Experience with scripting languages, preferably Python, for automated processes
Experience working in Linux environments and with shell scripting
Experience supporting infrastructure for large multi-services applications
Experience working with continuous deployment in micro-services architectures
Experience working with Docker, Containers, ECR and EKS
Observability - CloudWatch, OpenSearch, Dynatrace, Grafana, Prometheus
Familiarity with Fault Injection tooling (e.g., AWS Fault Injection Simulator, Gremlin, Chaos Toolkit, Chaos Monkey)
Automation mindset to enable consistency and dependability in common actions
All applicants must have resided in the United States for at least three (3) years

Company

Jobs via Dice

Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.

Funding

Current Stage
Early Stage
Company data provided by crunchbase