Ensemble Health Partners · 2 weeks ago
Lead Site Reliability Engineer
Maximize your interview chances
Health CareHospitality
Insider Connection @Ensemble Health Partners
Get 3x more responses when you reach out via email instead of LinkedIn.
Responsibilities
Lead the SRE team in building and maintaining resilient and scalable cloud infrastructure on Azure, adhering to high standards of uptime, performance, and reliability.
Design and implement comprehensive observability systems, including monitoring and alerting, ensuring 100% coverage of critical metrics. Defining 'what good looks like' for implementing observability.
Collaborate with partner teams (product, data engineering, IT operations) to integrate reliability and monitoring practices across the entire development lifecycle.
Define and enforce standards for service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs) for all services.
Proactively identify and address reliability risks before they impact production, including leading performance testing and disaster recovery drills.
Oversee incident management processes, including root cause analysis, post-mortems, and ensuring implementation of action items to prevent recurrence.
Drive the development and maintenance of reusable platform capabilities for engineering enablement, including network optimization, multi-region support, and automated deployment pipelines.
Provide technical mentorship to the SRE team and ensure continuous improvement in production reliability and observability.
Collaborate with senior management on regular reporting of system reliability and incident response outcomes.
Qualification
Find out how your skills align with this job's requirements. If anything seems off, you can easily click on the tags to select or unselect skills to reflect your actual expertise.
Required
7 to 10 Years of work experience.
Bachelors Degree or Equivalent Experience in Computer Science.
Preferred
Extensive experience with Azure cloud services (App Services, Azure SQL, Networking, Monitoring) and proven ability to manage scalable and resilient systems.
Leadership experience in SRE or DevOps roles with a focus on production reliability, observability, and platform optimization.
Strong expertise in infrastructure as code (e.g., Bicep, ARM templates, Terraform) and automation tools (e.g., Azure DevOps, Prometheus, Grafana).
Experience with incident management, troubleshooting, performance optimization, and disaster recovery planning.
Excellent communication and collaboration skills, with a track record of working across teams to implement best practices in reliability and observability.
Benefits
Comprehensive benefits package designed to support the physical, emotional, and financial health of you and your family, including healthcare, time off, retirement, and well-being programs
Professional certification relevant to their field
Tuition reimbursement
Quarterly and annual incentive programs for all employees who go beyond and keep raising the bar for themselves and the company
Company
Ensemble Health Partners
Ensemble Health Partners is the leading revenue cycle management company for hospitals, health systems and physician practices.
Funding
Current Stage
Late StageTotal Funding
unknown2022-03-25Private Equity
2019-05-30Acquired
Leadership Team
Recent News
Morningstar, Inc.
2024-11-01
Company data provided by crunchbase