ScienceLogic · 8 hours ago
Senior Site Reliability Engineer
ScienceLogic is redefining IT operations for the modern enterprise. They are seeking a Senior Site Reliability Engineer to lead the design and buildout of secure systems for their Artificial Intelligence Product in SaaS, ensuring high availability and performance while automating workflows and resolving operational issues.
Analytics · Artificial Intelligence (AI) · Cloud Data Services · Cloud Management · IT Management
Responsibilities
Lead design reviews and buildout of secure systems for delivering a new Artificial Intelligence product in SaaS, aiming for 99.99% uptime
Design, automate, test, and monitor the use of cloud native technologies as a foundation for a service platform
Spend 75% of your time on forward-looking priorities, designing and building SaaS systems, and the remainder supporting the Operations and Maintenance of the current SaaS infrastructure
Investigate and resolve customer and operational issues with the mentality of fixing and not just mitigating issues
Identify and automate measurement of operations SLAs and SLOs
Triage incidents, document SOPs and runbooks, and train NOC team members
Write automation that can be easily supported and extended by others
Collaborate across the organization to design, build, and operationalize SaaS services conforming to security standards like FedRAMP, SOC2, ISO, etc.
Participate in the on-call rotation as assigned
Take full responsibility for the availability and performance of the platform
Work on special projects as assigned
Qualifications
Required
U.S. citizenship is required for this position
8-12 years of site reliability engineering, cloud operations or equivalent experience
Proven experience in managing complex Kubernetes environments in multiple Production systems
Working with Cloud Automation tools like CloudFormation, Terraform, and aws-cli/CDK
Scripting languages like Python, Bash, Perl, etc.
Exposure to Linux administration skills
Proven track record of operating production SaaS environments within security standards like FedRAMP, SOC2, ISO, PCI
Skilled at problem solving, algorithms, and data structures conforming to the modern SaaS security requirements
Building tools and scripting frameworks from scratch
Familiarity with basic networking, security and cloud engineering concepts
Highly collaborative with effective written and verbal communication skills
Ability to work against tight deadlines, occasionally work during off-hours, and participate in the weekly on-call schedule
Bachelor's or Master's degree in Computer Science, Information Systems, or a similar field
Benefits
A remote flexible workplace.
Comprehensive medical, dental and vision plans.
401(k) plan with employer match.
Flexible Paid Time Off (FTO) so that you can take the time that you need to re-energize.
Volunteer Time Off (VTO) - take two days off per calendar year to volunteer with your preferred charitable organization.
5-year Service Milestone Sabbatical.
Paid parental leave.
Generous employee referral bonus program.
Pet insurance.
HQ Office centrally located in Reston Town Center featuring a well-stocked kitchen with rotating snacks and beverages, and catered lunch on Thursdays.
Regular virtual company-wide events, including cooking classes, yoga, meditation and more.
The opportunity to learn and develop from some of the best and brightest minds in the industry!
Company
ScienceLogic
ScienceLogic is a provider of AI-driven monitoring solutions for hybrid cloud management.
Funding
Current Stage: Late Stage
Total Funding: $235.17M
Key Investors: Silver Lake Waterman, Goldman Sachs, New Enterprise Associates
2022-10-21 · Series Unknown · $21.17M
2021-02-23 · Series E · $105M
2018-11-05 · Debt Financing · $25M
Company data provided by crunchbase