Jobs via Dice · 1 hour ago
AWS Cloud DevOps / Site Reliability Engineer (SRE)
Jobs via Dice is seeking a skilled and proactive AWS Cloud DevOps / Site Reliability Engineer (SRE) to join their team. This role combines software engineering and cloud operations expertise to build and maintain scalable, secure, and reliable cloud infrastructure using AWS services.
Computer Software
Responsibilities
Establish and maintain efficient and reliable Azure DevOps CI/CD pipelines to facilitate seamless integration between environments
Manage source code repositories with version control tools, following established branching strategies and release management practices
Implement and manage infrastructure using Terraform and Terragrunt as Infrastructure as Code (IaC)
Integrate IaC workflows into CI/CD pipelines for seamless, automated deployments
Manage scalable, secure, and highly available AWS infrastructure using services like Lambda, CloudWatch, EC2, S3, RDS, DynamoDB, API Gateway and VPC
Maintain reusable Terragrunt and Terraform modules/templates for consistent infrastructure patterns
Build and maintain monitoring and alerting systems that ensure high availability and resilience of applications and infrastructure through automation and auto-healing mechanisms
Identify performance bottlenecks, optimize system resources, and implement scaling strategies to support growing demands
Troubleshoot infrastructure and application issues, perform root cause analysis, and drive incident response
Implement security best practices, conduct vulnerability assessments, manage IAM roles/policies, and address security incidents promptly
Continuously evaluate existing systems, tools, and processes to identify areas for improvement
Recommend and implement enhancements to optimize efficiency, reliability, and scalability
Create and maintain documentation related to infrastructure, processes, and best practices
Collaborate with developers to support deployment, performance, and reliability of services
Conduct incident response and root cause analysis for infrastructure issues
Optimize cloud infrastructure for performance and cost
Create documentation and runbooks for operational procedures and troubleshooting
Participate in on-call rotation for SRE Support
Qualifications
Required
Hands-on experience in a DevOps or SRE role
Hands-on expertise with CloudWatch, Splunk, and Dynatrace
Strong experience with AWS services (e.g., S3, RDS, Lambda, API Gateway, VPC, IAM, EventBridge, serverless functions)
Experience with Infrastructure as Code (Terraform preferred)
Proficiency with scripting languages and SDKs (e.g., Python with Boto3, TypeScript)
Strong knowledge of CI/CD tools and processes
Experience with observability and monitoring tools
Understanding of networking, security, and cloud cost management
Excellent communication and collaboration skills
Ability to work independently and in a team-oriented environment
Strong problem-solving and debugging skills
Passion for automation and improving system reliability
Company
Jobs via Dice
Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase