Cutover · 17 hours ago
Site Reliability Engineer
Cutover is a company that values inclusivity and empathy in the workplace. They are seeking a Site Reliability Engineer to ensure the reliability and performance of their production systems, collaborating closely with support and engineering teams to optimize the platform's reliability.
Analytics, Data Integration, Data Visualization, SaaS, Software
Responsibilities
Incident Response: Respond to incidents and alerts, triaging urgency and investigating root cause
Documentation: Contribute regularly to our documentation on system design, troubleshooting, best practices, and engineering processes
Root Cause Analysis: Contribute to post-mortems and help identify long-term improvements under guidance
Collaboration: Support cross-functional teams during investigations and post-incident reviews
Observability: Support and enhance observability tools and techniques by identifying metrics, logging, and alerting improvements
Automation: Write and execute simple automation scripts (e.g. Python, Ruby, Bash) to improve reliability and reduce toil
Development: Work on internal tools, pipelines, and IaC solutions to help improve the speed of software delivery and recovery
System Reliability: Work on efforts to enhance the reliability and performance of our application and systems, ensuring optimal uptime and minimal disruptions
Infrastructure Optimization: Work closely with the development and platform engineering teams to optimize the infrastructure on AWS, ensuring scalability and efficiency
Qualifications
Required
A genuine excitement for complex problem solving within our tech stack, applying what you know to our unique problems
Familiarity with at least one scripting language such as Ruby, JavaScript, Python, Bash
Experience with containerization (e.g. Docker) or IaC (e.g. Terraform, Helm, CloudFormation)
An eagerness to follow modern engineering practices and learn from others
Familiarity with observability tools such as DataDog, New Relic, Grafana, Prometheus, ELK, or OpenTelemetry
Understanding of core networking concepts (DNS, HTTP/S, Load Balancing, etc.)
A collaborative mindset with clear communication skills
A willingness to ask questions to gain a better understanding of new or complex concepts
Preferred
Exposure to major incident response processes
AWS Certified Cloud Practitioner or hands-on experience with cloud environments
Benefits
Share Options
20 days of PTO per year + public holidays, and we want you to take all of them!
3 volunteer days to use for any charitable/voluntary cause you would like.
A top-tier private health insurance package.
401k contribution plan
Work from home stipend
A personal learning and development budget through Learnerbly. You’ll be supported in your quest for knowledge, whatever that looks like to you.
If you’re thinking of starting or growing your family, then you’ll be in great company - more than half of our team are parents and we’ve built a globally consistent parental leave approach that we’re proud of.
Employee Referral Scheme.
Safeguarding the mental health of our teams is paramount for us. If you’d like to, then you’ll be able to avail yourself of multiple Cutover mental health initiatives, from fully subsidised therapy sessions to subscriptions to leading wellbeing platforms.
Company
Cutover
Cutover is an orchestration and observability platform that optimizes the planning of complex workflows.
H1B Sponsorship
Cutover has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2024 (2)
2022 (2)
Funding
Current Stage: Growth Stage
Total Funding: $54.62M
Key Investors: Eldridge Industries, Index Ventures, Sussex Place Ventures
2021-03-03 · Series B · $35M
2019-11-12 · Series A · $17M
2016-06-13 · Seed · $2.5M
Company data provided by Crunchbase