Principal Site Reliability Engineer @ SentinelOne | Jobright.ai
99 applicants
This job has closed.

SentinelOne · 1 month ago

Principal Site Reliability Engineer

Artificial Intelligence (AI) · Cyber Security
Growth Opportunities
No H1B · U.S. Citizen Only · Security Clearance Required

Responsibilities

Design and guide the implementation of end-to-end alert correlation, auto-triage, and auto-remediation frameworks that meet the needs of a microservices-based SaaS architecture.
Ensure solutions align with business priorities and customer impact goals.
Define, implement, and monitor SLOs in collaboration with product and engineering teams.
Establish reliability standards that meet business and customer expectations, driving accountability and transparency around service performance.
Partner with software engineers, SREs, and data scientists to implement and refine monitoring, alerting, alert correlation, auto-remediation, and SLO solutions.
Lead initiatives to promote best practices and knowledge sharing across all of SentinelOne engineering.
Mentor engineers and contribute to a culture of reliability engineering excellence through thought leadership and guidance on advanced SRE principles and practices.

Qualifications

SRE Experience, Incident Management, Alert Correlation, Automated Triage, Self-Healing Strategies, SLO Frameworks, Observability Platforms, Python, Go, Java, Machine Learning, Data Analysis, AWS, GCP, Azure, Kubernetes, Terraform

Required

U.S. Citizenship is required for this position.
Proven experience in architecting and implementing SRE solutions at scale within a microservices or distributed systems environment.
15+ years of progressive professional experience, with 5+ years of recent experience supporting enterprise SaaS environments (or equivalent combination of education, experience, and certifications).
Deep knowledge of incident management, alert correlation, automated triage, self-healing strategies, and SLO frameworks.
Strong understanding of observability platforms, including monitoring, logging, and tracing solutions.
Proficient in one or more programming languages (e.g., Python, Go, Java) with experience in automation and scripting for incident management workflows.
Experience with machine learning, anomaly detection, or data analytics techniques for real-time alert correlation and triage systems.
Expertise in cloud platforms (e.g., AWS, GCP, Azure) and container orchestration (e.g., Kubernetes), with experience in infrastructure-as-code (e.g., Terraform).
Ability to make critical architectural decisions with a focus on business impact, reliability, and system performance.

Benefits

Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
Unlimited PTO
Industry-leading gender-neutral parental leave
Paid Company Holidays
Paid Sick Time
Employee stock purchase program
Disability and life insurance
Employee assistance program
Gym membership reimbursement
Cell phone reimbursement

Company

SentinelOne

SentinelOne is an autonomous cybersecurity solution company.

Funding

Current Stage
Public Company
Total Funding
$696.52M
Key Investors
Tiger Global Management, Insight Partners, Redpoint
2021-06-30: Post-IPO Equity
2021-06-30: IPO
2020-11-11: Series F · $267M

Leadership Team

Tomer Weingarten
Co-Founder and CEO
Wayne Phillips
Field CTO
Company data provided by Crunchbase