Site Reliability Engineer (Application SRE) jobs in United States

The Dignify Solutions, LLC · 1 month ago

Site Reliability Engineer (Application SRE)

The Dignify Solutions, LLC is seeking a Site Reliability Engineer to enhance their application reliability and performance. This role involves mapping application deployment architectures, implementing SRE practices, and ensuring adherence to AWS security best practices.

Bookkeeping and Payroll, Human Resources, Recruiting, Staffing Agency, Training

Responsibilities

Map an application's deployment architecture, including cloud infrastructure and dependencies
Experience with chaos testing scenarios (preferably using Gremlin)
Ability to identify failure modes across end-to-end journeys (UI, authentication layer, application code, third-party systems, databases, data, capacity, infrastructure, firewall, and network)
Integrate SRE practices into the Incident Management and Change Deployment processes
Implement SRE practices in line with AWS security best practices and the AWS Well-Architected Framework
Develop and maintain SRE runbooks
Understand and share resiliency architectures
Strong understanding of SLOs, SLIs, and error budgets and their implementation in SRE areas

Qualifications

Java (JDK 8+), AWS services, Microservices architecture, Cloud-native architecture, UI development (Angular), UI development (Node.js), AWS Certified Solutions Architect, Shell scripting, PowerShell, Python, Scala, Monitoring tools (Splunk), Monitoring tools (Grafana), Chaos testing

Required

Demonstrable experience with Java (JDK 8+) and microservices architecture
Hands-on experience with AWS services (EC2, ECS, S3, CloudFormation templates, Aurora, DynamoDB, Lambda, SQS, SNS, RDS, API Gateway, VPC, Route 53, Kinesis, CloudWatch, AWS SDK)
Experience monitoring and testing subsystems with Splunk, Honeycomb, OpenTelemetry, and Grafana
Experience with UI development tools (Angular and Node.js)
5+ years of direct implementation of AWS architecture solutions
Programming experience in at least one of Shell, PowerShell, Python, Java, or Scala
4+ years of hands-on experience in cloud-native architecture design and implementation of distributed, fault-tolerant enterprise applications for the cloud

Preferred

AWS Certified Solutions Architect - Professional

Company

The Dignify Solutions, LLC

The Dignify Solutions, with global capabilities and local excellence, has a combined 30+ years of experience in client services, engagement, relationship and partnership management, sales and account management, service delivery, recruiting, staffing, and talent acquisition across the full range of Information Technology skill sets (digital transformation, artificial intelligence, machine learning, and other business domains).

Funding

Current Stage
Growth Stage
Company data provided by Crunchbase