Ampstek · 9 hours ago
Site Reliability Engineer (SRE)/Cloud Platform Architect (USC and GC only, W2)
Ampstek is seeking a Site Reliability Engineer (SRE)/Cloud Platform Architect to join its team in Charlotte, NC. The role involves designing, implementing, and maintaining highly available systems, and ensuring their performance and reliability through observability tools and automation.
Responsibilities
Design, implement, and maintain highly available and scalable systems
Monitor system performance, availability, and reliability using observability tools
Automate infrastructure provisioning, deployments, and operational tasks
Manage incident response, root cause analysis (RCA), and postmortems
Improve system reliability through SLOs, SLIs, and error budgets
Optimize system performance, capacity planning, and cost efficiency
Maintain CI/CD pipelines and deployment strategies
Ensure system security, compliance, and disaster recovery readiness
Collaborate with software engineers to improve application reliability
Qualifications
Required
Kubernetes
OpenShift
Terraform
AWS
Azure
CI/CD
Jenkins
GitHub
GitLab
Docker
Prometheus
Grafana
Python
Shell scripting
Bachelor's degree in Computer Science, Engineering, or equivalent experience
Strong experience with Linux/Unix system administration
Proficiency in cloud platforms (AWS, Azure, or GCP)
Hands-on experience with containerization (Docker, Kubernetes)
Experience with Infrastructure as Code (IaC) tools (Terraform, CloudFormation)
Strong scripting skills (Python, Bash, Go, or similar)
Knowledge of monitoring & alerting tools (Prometheus, Grafana, Datadog, New Relic)
Experience with CI/CD tools (Jenkins, GitHub Actions, GitLab CI)
Understanding of networking, DNS, load balancing, and security concepts
Preferred
Experience with microservices and service meshes
Knowledge of chaos engineering tools
Experience implementing reliability best practices (SRE principles)
Cloud security and compliance experience
Certifications: AWS/GCP/Azure, Kubernetes (CKA/CKAD)