Jobs via Dice · 6 hours ago

FedNow Lead Site Reliability Engineer

Boston, MA

Full-time

Onsite

Senior Level, Lead/Staff

$153K/yr - $253K/yr

The Federal Reserve Financial Services (FRFS) is transforming into a national, enterprise-focused organization. They are seeking a Lead Site Reliability Engineer to operate the production environment for the FedNow program, architecting solutions for monitoring and tooling, and ensuring seamless automation across the platform.

Computer Software

No H1B

U.S. Citizen Only

Responsibilities

As a Lead Engineer of the SRE / Production Operations team for FedNow, you will operate the production environment for the program

You will architect, implement, and leverage solution monitoring and tooling to be used for capacity planning, utilization reporting, and scaling

The team uses open source and proprietary software to support Engineering, DevOps, and DevSecOps tools, services, and solutions

CI/CD and IaC Pipeline automation design and development

Resiliency, disaster recovery (DR), and business continuity planning (BCP), including testing

The SRE / Production Operations team is part of the Technical Operations (TechOps) department and has the overall responsibility for the design, management and execution of operations required to support the ongoing technical and delivery needs of the FedNow Program, as well as the transition to production support and operations

This team interfaces with internal stakeholders and customers for planning, delivery, and service management

It owns ongoing ITIL processes, and the implementation and driving of continuous improvement initiatives

You will work closely with Engineers and Architects of the FedNow program in order to maintain seamless automation across the entire platform

Proactively identify suspected gaps in system architecture and design experiments to expose them

The ideal candidate loves building and maintaining reliable, scalable systems and CI/CD tooling, and automating cloud-based, highly available, high-performing applications

Qualifications

AWS environments, CI/CD tooling, Infrastructure automation, Micro-services architecture, Python scripting, Docker, Containers, Observability tools, Fault Injection tooling, Communication, Collaboration skills

Required

Strong communication and collaboration skills

Extensive knowledge and understanding of working in AWS environments and services (EC2, EBS, EKS, RDS, Aurora, S3, Route 53, ELB, IAM, etc.)

HashiCorp Terraform, Consul, and Vault, plus Ansible

Automation experience, preferably using GitLab

Experience with scripting languages, preferably Python, for automated processes

Experience working in a Linux environment and with shell scripting

Experience supporting infrastructure for large multi-services applications

Experience working with continuous deployment in micro-services architectures

Experience working with Docker, Containers, ECR and EKS

Observability - CloudWatch, OpenSearch, Dynatrace, Grafana, Prometheus

Familiarity with Fault Injection tooling (e.g., AWS Fault Injection Simulator, Gremlin, Chaos Toolkit, Chaos Monkey)

Automation mindset to enable consistency and dependability in common actions

All applicants must have resided in the United States for at least three (3) years
