KYYBA Inc · 1 month ago
SRE DevOps
KYYBA Inc is seeking an experienced SRE DevOps professional. The role involves implementing observability frameworks, ensuring platform stability, and managing the problem management lifecycle to enhance operational resilience.
Automotive · Consulting · Staffing Agency
Responsibilities
Implement and enhance proactive observability frameworks to anticipate and mitigate issues before they occur
Optimize experience monitoring and user interaction metrics across applications and services
Manage and improve the event catalog, ensuring all system events are structured and actionable
Build and maintain dashboards, alerts, and health reporting using tools like Dynatrace, BigPanda, MonPro, and LogScale
Perform service tuning to improve system performance based on real-time metrics and data analysis
Establish and maintain observability standards and best practices across teams
Conduct chaos testing and resilience validation to ensure high system availability
Lead anomaly detection practices to quickly identify and respond to unusual system behavior
Ensure platform stability, performance, and reliability through proven reliability engineering principles
Drive SRE initiatives, including continuous improvement projects within the Site Reliability Center
Develop, maintain, and scale automated orchestration pipelines to streamline operations and improve efficiency
Create, maintain, and enforce SRE standards, including SLIs, SLOs, and operational playbooks
Lead and conduct root cause analysis for critical incidents and drive long-term remediation improvements
Own the problem management lifecycle—identifying, tracking, and resolving underlying issues to prevent recurring incidents
Collaborate with cross-functional teams to address systemic issues and drive operational resilience
Qualifications
Required
7+ years of experience in SRE, DevOps, or Infrastructure Engineering roles
Hands-on expertise with observability and monitoring tools such as Dynatrace (APM, RUM, dashboards, alerting), BigPanda (event correlation, incident response), and LogScale, MonPro, LogicMonitor, or similar log and metrics platforms
Solid experience with cloud platforms (AWS, Azure, or GCP)
Strong proficiency in automation & orchestration (Terraform, Ansible, Jenkins, GitHub Actions, etc.)
Proven track record in incident management, RCA, and implementing reliable SRE practices
Experience with CI/CD pipelines, infrastructure as code, and configuration management
Deep understanding of Linux systems, networking fundamentals, and distributed system design
Strong scripting abilities (Python, Bash, PowerShell, or equivalent)
Excellent communication, leadership, and cross-team collaboration skills
Preferred
Experience leading SRE or DevOps teams
Knowledge of chaos engineering, advanced anomaly detection, and proactive alerting strategies
Experience implementing SLI/SLO frameworks and performance optimization programs
Familiarity with containerization (Docker, Kubernetes) and service meshes
Company
KYYBA Inc
H1B Sponsorship
KYYBA Inc has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below for
your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (55)
2024 (85)
2023 (79)
2022 (95)
2021 (78)
2020 (71)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase