AlphaSense · 2 days ago
Sr. Site Reliability Engineer
Analytics · Artificial Intelligence (AI)
Responsibilities
Elevate product reliability to achieve 99.99% uptime and improve existing systems and processes.
Engage with engineering teams to understand product requirements and enhance software application build/test/deploy processes.
Participate in an on-call rotation to address availability incidents and support application engineers during customer incidents.
Thoroughly document actions, transform findings into repeatable processes, and automate them.
Debug production issues across various services and stack levels.
Enhance system performance through performance testing, implement monitoring solutions, and strategize for system scalability.
Oversee release engineering management, coordinate software releases, and ensure efficient and reliable software delivery.
Qualifications
Required
Strong experience in Kubernetes, Helm, Prometheus, Fluentd, Grafana, and other Cloud-Native solutions.
Able to identify opportunities to improve system reliability, such as utilization, scalability, and efficiency, and to drive their implementation.
Your adeptness in operating systems, networks, or hardware empowers you to quickly diagnose complex issues and identify critical system bottlenecks.
You have mastered automation as a way to reduce toil and are an expert in troubleshooting complex system issues that impact reliability.
You know how complex distributed systems fail and look for ways to protect the software and system.
You can navigate a full-stack application and build proficiency with the right tools to dig deep into system issues.
You have strong experience in building effective processes for continuous improvement of Service Level Objectives (SLOs), mean time to identification (MTTI), mean time to resolution (MTTR), and mean time to failure (MTTF).
You can proactively identify issues with technical dependencies of projects that are owned by other teams and surface them.
You are able to decompose reliability problems or business scenarios into solutions composed of multiple software or systems components interacting with each other.
Proficient in improving system design and architecture by recognizing key risk areas (failure domains) and prioritizing improvements with good judgment, distinguishing aspects that can be deferred from those that yield the greatest benefit for the company.
You are a true partner to stakeholders, executing against the spirit, not just the letter, of requirements.
You can take ownership of creating documentation at the RFC (Request for Comments) level, which requires a high degree of collaboration.
Experience in one or more of the following: Java, Go, NodeJS, React, Python
Preferred
Experience working with public cloud providers - AWS/GCP
Experience working with on-call Incident Response solutions
Benefits
Performance-based bonus
Equity
Generous benefits program
Company
AlphaSense
AlphaSense is an intelligence platform that uses artificial intelligence to allow professionals to make critical decisions. It is a sub-organization of AlphaSense.
H1B Sponsorship
AlphaSense has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2023 (3)
2022 (3)
2021 (3)
2020 (4)
Funding
Current Stage
Late Stage
Total Funding
$770.05M
Key Investors
Bond, CapitalG, BlackRock
Funding Rounds
2023-09-28 · Series E · $150M
2023-04-11 · Series D · $100M
2022-06-15 · Series D · $225M
Company data provided by Crunchbase