Babylist · 18 hours ago
Staff Software Engineer, Site Reliability
Baby · E-Commerce
Growth Opportunities · H1B Sponsor Likely
Responsibilities
Manage and build our AWS infrastructure using Infrastructure as Code (IaC) tools like Terraform. You will ensure that our EKS clusters and databases are running up-to-date versions, optimizing performance and reliability
Improve the speed and reliability of our Continuous Integration (CI) systems to support the entire Engineering Team, enabling faster and more efficient development and deployment processes
Provide support to developers in troubleshooting issues across local development, staging, and production environments
Establish, communicate, and support best practices for monitoring and alerting. This will involve setting up effective monitoring systems and defining actionable alerts for proactive incident management
Qualifications
Required
6+ years of experience as a Site Reliability Engineer or in a similar role, demonstrating a strong background in maintaining highly available and scalable systems
Experience supporting high-traffic consumer-facing websites, understanding the unique challenges and considerations in maintaining such systems
Proficiency with Terraform is a must, as you will be a member of the team responsible for managing and building our AWS infrastructure using Infrastructure as Code (IaC) practices
You possess strong experience working with AWS cloud-based infrastructure and services, ensuring their reliability, performance, and security
Proficiency with Docker and Kubernetes is essential, as you will contribute to the design, deployment, and management of containerized applications in our environment
You have a solid understanding of cloud-native systems design, including CDNs, load balancers, cloud networking, DNS, caching, and distributed systems
Troubleshooting and debugging are second nature to you, allowing you to quickly identify and resolve issues across various environments
Experience designing and supporting CI systems such as CircleCI, Jenkins, or GitHub Actions
You are familiar with monitoring and alerting best practices, utilizing tools like Datadog, Cronitor, Sentry, and PagerDuty to ensure proactive identification and resolution of issues
Proven experience in on-call management best practices, including effective incident response, escalation procedures, and post-incident reviews to drive continuous improvement and ensure system reliability
You have excellent verbal and written communication skills, and the ability to collaborate effectively with cross-functional teams
Benefits
Company paid medical, dental, and vision
A generous paid parental leave policy
401k with company match
Flexible spending account
Paid leave (including PTO and parental leave)
Company
Babylist
Babylist is the leading marketplace and commerce destination for baby
H1B Sponsorship
Babylist has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2022 (1)
2021 (2)
Funding
Current Stage: Growth Stage
Total Funding: $40.65M
Key Investors: Norwest Venture Partners
2021-11-04 · Series C · $40M
2018-02-15 · Series Unknown · amount not listed
2013-06-25 · Seed · $0.65M
Company data provided by Crunchbase