SentinelOne · 3 days ago
Engineering Manager, Site Reliability (SRE)
SentinelOne is redefining cybersecurity by leveraging AI-powered, data-driven innovation. The company is seeking an experienced engineering and operations manager to lead a Site Reliability Engineering (SRE) team. The role focuses on ensuring the reliability and scalability of SentinelOne's products and production services while collaborating with other engineering teams and customer-facing departments.
Artificial Intelligence (AI) · Cyber Security · Network Security · Security
Responsibilities
Grow and lead a team of SRE professionals, including setting performance goals and measuring deliverables against key metrics, while evolving those metrics as S1 grows and needs develop
Invest in data-driven deep triage on recurring issues, collaborating with other engineering teams to identify and address issues related to reliability, performance, and capacity
Develop, improve, and implement processes for the full incident lifecycle, including incident management, post-incident analysis, and learning from incidents. Lead incident response efforts, including coordinating with other teams to investigate and resolve customer-impacting incidents
Design a support model for SRE that covers service maturity and service ownership, including monitoring and alerting improvements and SLI/SLO design and implementation
Analyze production metrics and signals to identify areas for improvement and take proactive steps to mitigate issues
Develop and implement best practices and standards for Site Reliability Engineering, from day-to-day operations to hiring and planning
Communicate effectively with cross-functional teams to ensure alignment on objectives and priorities. Deliver outcomes, not just stories and tasks
Qualifications
Required
8+ years of related engineering experience, with at least 2 years in a management role
Demonstrated experience leading technical and operational teams at various stages of maturity
Excellent analytical and problem-solving skills
Familiarity with modern software development methodologies, tools, and techniques, including CI/CD
Experience working with cloud-native applications and large-scale distributed systems, including a working knowledge of technologies such as Kubernetes and Terraform/IaC, and cloud providers such as AWS or GCP
Experience with various monitoring and alerting techniques and tools, including frameworks and concepts such as SLOs, OpenTelemetry (OTel), and Golden Signals, as well as tooling such as Prometheus and Grafana
Extensive experience with incident response and management at various layers of the stack across different business needs and applications, including both hands-on experience leading incidents/post-incident analysis and experience driving broader incident management initiatives
Ability to thrive in a fast-paced, dynamic environment
Benefits
Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
Unlimited PTO
Industry-leading gender-neutral parental leave
Paid Company Holidays
Paid Sick Time
Employee stock purchase program
Disability and life insurance
Employee assistance program
Gym membership reimbursement
Cell phone reimbursement
Numerous company-sponsored events, including regular happy hours and team-building events
Company
SentinelOne
SentinelOne is an autonomous cybersecurity solution company.
Funding
Current Stage: Public Company
Total Funding: $696.52M
Key Investors: Tiger Global Management, Insight Partners, Redpoint
2021-06-30: Post-IPO Equity
2021-06-30: IPO
2020-11-11: Series F · $267M
Company data provided by Crunchbase