Ambry Genetics · 7 hours ago
Sr. Site Reliability Engineer-Remote USA
Health Care · Health Diagnostics
H1B Sponsor Likely
Responsibilities
Design and implement a comprehensive observability strategy using Datadog to provide a single pane of glass across development, operations, infrastructure, data, and database workloads
Develop and maintain sophisticated alerting frameworks that minimize alert fatigue while ensuring critical issues are detected early
Create and optimize SLIs, SLOs, and error budgets across services
Implement automated remediation workflows for common failure scenarios
Work with development teams to implement proper instrumentation, logging, and monitoring best practices
Lead incident response and postmortem analyses, and implement systematic improvements
Design and maintain dashboards that provide actionable insights for different stakeholder groups
Reduce toil through automation, using infrastructure-as-code and monitoring-as-code practices
Other duties as assigned
Qualifications
Required
5+ years of hands-on SRE experience in large-scale production environments
Deep expertise with Datadog, including APM, Infrastructure Monitoring, Log Management, and Synthetic Monitoring
Strong experience writing and optimizing monitoring as code using Terraform or similar tools
Proficiency in at least one programming language (Python, Go, or Java preferred)
Experience with modern observability practices including distributed tracing, metric aggregation, and log correlation
Strong understanding of reliability engineering principles including SLIs, SLOs, error budgets, and toil reduction
Experience with cloud platforms (AWS, Azure, or GCP) and containerized environments
Knowledge of database systems and their monitoring requirements
Understanding of network protocols and ability to troubleshoot network-related issues
Preferred
Master's degree
Experience with chaos engineering practices
Knowledge of machine learning for anomaly detection
Experience with high-throughput, low-latency systems
Benefits
Short-Term Incentive Plan with a target of 7.5% of your annual earnings (terms and conditions apply)
Medical
Dental
Vision
401k with a 4% employer match
FSA
Paid sick leave
Generous paid time off (PTO) program
Company
Ambry Genetics
Ambry leads in clinical genetic diagnostics and genetics software solutions.
H1B Sponsorship
Ambry Genetics has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is shown below for
reference. (Data from the US Department of Labor)
Trends of Total Sponsorships
2023 (2)
2022 (3)
2021 (6)
2020 (8)
Funding
Current Stage
Late Stage
Total Funding
Unknown
2017-10-19 · Series Unknown
2017-07-07 · Acquired
Company data provided by Crunchbase