Senior Site Reliability Engineer @ EDZ Systems | Jobright.ai
200+ applicants

EDZ Systems · 20 hours ago

Senior Site Reliability Engineer

Consulting · Software
Actively Hiring
Hiring Manager
Alex De Zeeuw

Responsibilities

As a Senior SRE, you will operate the production environment for the program.
You will architect, implement, and leverage solution monitoring and tooling to be used for capacity planning, utilization reporting, and scaling.
The team uses open source and proprietary software to support Engineering, DevOps, and DevSecOps tools, services, and solutions.
CI/CD and IaC Pipeline automation design and development.
Resiliency, DR, and BCP (including testing).
The SRE / Production Operations team is part of the Technical Operations (TechOps) department and has the overall responsibility for the design, management and execution of operations required to support the ongoing technical and delivery needs of the Program, as well as the transition to production support and operations.
This team interfaces with internal stakeholders and customers for planning, delivery, and service management.
It owns ongoing ITIL processes, and the implementation and driving of continuous improvement initiatives.
You will work closely with Engineers and Architects of the program in order to maintain seamless automation across the entire platform.
Proactively identify suspected gaps in system architecture and design experiments to expose them.
The ideal candidate loves building and maintaining reliable, scalable systems and CI/CD tooling, and automating cloud-based, highly available, high-performing applications.

Qualification


Required

Skills in AWS Cloud and Postgres SQL
Experience with Java, Angular, Python, Docker, Kubernetes, Blockchain
Experience with AWS technologies like S3, API Gateway, EMR, Kinesis, Lambda, EKS, SQS, Aurora, and DynamoDB
Experience in software engineering principles for designing, implementing, configuring, and optimizing applications and databases
Experience with test automation tools and integrations with Commercial Off the Shelf (COTS) products
Experience with troubleshooting, root cause analysis, incident and service request management, and providing on-call support
Strong communication and collaboration skills
Technical/functional expertise in tooling for ITIL, Agile, Project Management, and SDLC
Extensive knowledge and understanding of working in AWS environments & services
Experience with Hashicorp Terraform, Consul, Vault, and Ansible
Automation experience, preferably with GitLab
Experience with scripting languages, preferably Python, for automated processes
Monitoring/measuring of KPIs with a focus on RCA and corrective action
Experience supporting infrastructure for large multi-service applications
Experience working with continuous deployment in microservices architectures
Experience in fault injection/experimentation and system attacks
Familiarity with fault injection tooling (e.g., AWS Fault Injection Simulator, Gremlin, ChaosToolkit, Chaos Monkey)
Best practices in chaos engineering process and implementation (Chaos gamedays, business critical KPIs, etc.)
Observability with CloudWatch, Dynatrace, Grafana, Prometheus
Automation mindset to enable consistency and dependability in common actions
Test development and debugging experience

Preferred

Experience with CI/CD and IaC Pipeline automation design and development
Resiliency, DR, and BCP (including testing)
Experience with Confluence, Jira/Octane
Experience with EC2, EBS, RDS, Aurora, S3, Route 53, ELB, IAM, etc.
Experience in chaos engineering process and implementation

Company

EDZ Systems

EDZ Systems is a certified minority- and woman-owned small business providing software development and strategic consulting services.

Funding

Current Stage
Growth Stage

Leadership Team

Elizabeth DeZeeuw
President and Chief Executive Officer
Company data provided by Crunchbase