LinkedIn · 4 months ago
Distinguished Software Engineer, Reliability Infra
LinkedIn is the world’s largest professional network, built to create economic opportunity for every member of the global workforce. The Distinguished Software Engineer, Reliability Infra will serve as a senior technical leader, driving the reliability and observability strategy across LinkedIn’s infrastructure while mentoring engineers and leading incident response efforts.
Professional Networking, Recruiting, Social Media, Social Recruiting
Responsibilities
Serve as a senior technical leader driving the long-term reliability and observability strategy across LinkedIn’s infrastructure
Re-architect LinkedIn’s backend systems to enable granular failure domains and reduce the blast radius of incidents
Design and implement next-generation failure mitigation strategies that avoid full-region or full-datacenter failovers
Partner closely with engineers across many disciplines to raise the bar for operational excellence and incident response
Define and build frameworks to improve monitoring, alerting, and observability across hundreds of services and systems
Define and own the roadmap for bringing observability to critical user journeys in LinkedIn’s products, to help capture and improve the experience of LinkedIn’s members and customers
Spearhead a multi-year initiative to transition LinkedIn’s infrastructure to a regionalized model with localized failover, enhancing both scalability and availability
Lead technical discussions on the future of Engineering at LinkedIn, including what the function should evolve into over the next 3–5 years
Deliver key insights and executive-level reporting across cross-functional engineering teams to enable the right business decisions about improving the quality and reliability of our services and products
Act as a force multiplier by mentoring engineers, influencing technical direction across orgs, and contributing deeply to culture, hiring, and technical excellence
Lead incident response and post-incident reviews to identify root causes and implement preventive measures. Develop and maintain incident management processes and procedures to ensure timely resolution of issues and minimize impact on customers
Qualifications
Required
15+ years of software engineering experience
8+ years focused on infrastructure, reliability engineering, or distributed systems
Preferred
Hands-on experience with large-scale incident response, root cause analysis, and resiliency engineering
Strong communication and cross-functional collaboration skills, with experience influencing across multiple orgs and leadership levels
Proven success designing and leading architectural transformations at internet-scale companies
Deep knowledge of systems reliability, observability frameworks, and fault-tolerant architecture design
Experience with multi-region architecture, capacity planning, and failover strategies in large-scale cloud or hybrid environments
Background in CI/CD, platform reliability, and automation of ops-heavy systems
Familiarity with modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana) and service mesh architecture
Track record of setting long-term technical strategy and driving systemic improvements in availability and performance
Previous experience in a Distinguished Engineer or equivalent role at a high-growth or web-scale technology company
Benefits
Annual performance bonus
Stock
Benefits and/or other applicable incentive compensation plans
Company
LinkedIn is a professional networking site that allows users to create business connections, search for jobs, and find potential clients. It is a subsidiary of Microsoft.
H1B Sponsorship
LinkedIn has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (787)
2024 (1108)
2023 (913)
2022 (1580)
2021 (1043)
2020 (1146)
Funding
Current Stage: Public Company
Total Funding: $154.8M
Key Investors: Bain Capital Ventures, Greylock, Sequoia Capital
2016-06-13: Acquired
2016-02-15: Private Equity
2014-04-01: Series Unknown
Company data provided by Crunchbase