MaTi Group Inc · 5 hours ago
Site Reliability Engineer (SRE)
MaTi Group Inc. is a leading organization specializing in talent acquisition and project development services. They are seeking a Site Reliability Engineer (SRE) to maintain the reliability and availability of systems, troubleshoot technical issues, and develop software to improve system operations.
Computer Software
Responsibilities
Design, build, and operate highly scalable, reliable, and available cloud platforms using AWS and Azure
Apply SRE principles including SLIs, SLOs, and error budgets to balance system reliability and feature velocity
Architect and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, and automation best practices
Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, and AWS CDK for global infrastructure automation
Lead incident response and on-call operations, following ITIL frameworks and managing workflows in ServiceNow
Perform Root Cause Analysis (RCA) and maintain detailed post-incident documentation and knowledge bases
Drive performance, capacity planning, and resiliency testing to ensure durability of mission-critical systems
Optimize cloud cost management, autoscaling thresholds, and resource utilization across environments
Implement advanced observability, monitoring, and distributed tracing using Dynatrace and Kibana
Build intelligent dashboards and enable proactive anomaly detection to reduce MTTR
Manage Linux-based systems and relational and NoSQL databases, applying solid networking fundamentals
Support containerized workloads using Docker and orchestration via Kubernetes or Amazon ECS
Develop automation and tooling using Python or similar scripting languages
Enforce security and compliance best practices, including service accounts, certificate management, and rapid remediation
Collaborate cross-functionally with development, operations, and security teams, demonstrating strong communication and ownership
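For candidates less familiar with the SLI/SLO/error-budget responsibility listed above, the arithmetic can be sketched roughly as follows. This is a minimal illustration that assumes a request-based availability SLI. The function name and the sample figures are hypothetical and do not describe MaTi Group's actual tooling or targets.

```python
def error_budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the unspent fraction of the error budget for one SLO window.

    Assumption: the SLI is request-based availability (good requests / total requests).
    1.0 means the budget is untouched, 0.0 means it is exhausted,
    and a negative value means the SLO has been breached.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The error budget is the number of failures the SLO tolerates in the window.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


# Hypothetical window: 99.9% SLO, 1,000,000 requests, 400 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 400)
print(f"Error budget remaining: {remaining:.0%}")
```

A team might slow feature releases as the remaining budget approaches zero and resume normal velocity once it recovers. The exact policy is a judgment call for each organization.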
Qualifications
Required
Proficiency in Site Reliability Engineering practices and experience in troubleshooting complex technical issues
Software Development skills with a strong understanding of programming languages and frameworks
Hands-on experience with System Administration, including deploying, configuring, and maintaining systems
Knowledge of Infrastructure and cloud technologies to support scalable and reliable systems
Strong problem-solving skills and attention to detail
Ability to collaborate effectively within a hybrid work environment
Relevant certifications in cloud platforms or system administration are advantageous
AWS
Azure
Terraform
CloudFormation
GitHub Actions
CI/CD
SRE
SLIs
SLOs
Error Budgets
Dynatrace
Kibana
ServiceNow
ITIL
Python
Linux
Docker
Kubernetes
ECS
Networking
Databases
Incident Management
Preferred
Familiarity with DevOps tools and practices is a plus
Company
MaTi Group Inc
Welcome to Mati Inc., your premier partner in talent acquisition and project development services.
Funding
Current Stage
Growth Stage