Staff Software Engineer, Robinhood Command Center jobs in United States

Robinhood · 12 hours ago

Staff Software Engineer, Robinhood Command Center

Robinhood, a company focused on democratizing finance for all, is seeking a Staff Reliability Engineer to join its newly formed Robinhood Command Center team. The role involves leading incident mitigation efforts, developing incident management processes, and driving operational excellence across Robinhood’s infrastructure.

Cryptocurrency · FinTech · Prediction Markets · Stock Exchanges · Trading Platform

Responsibilities

Serve as a senior technical leader driving the long-term reliability and observability strategy across Robinhood’s infrastructure
Partner closely with engineers across many disciplines to raise the bar for operational excellence and incident response
Lead incident mitigation efforts by coordinating service owners, facilitating time-sensitive decisions such as rollbacks and traffic shifts, and maintaining a clear source of truth during active incidents
Develop and maintain incident management processes and procedures to ensure timely resolution and minimize customer impact
Own incident discovery at the company level by defining and maintaining global dashboards and alerts tied to critical user journeys (CUJs), availability, and business-impact metrics
Own and evolve incident response tooling and processes, including education, adoption, and measurement of MTTD/MTTR improvements
Drive post-incident governance and learning, defining standards for postmortems, SEV reviews, and follow-up tracking to ensure durable reliability improvements
Design and implement next-generation failure mitigation strategies that avoid full-region or full-datacenter failovers
Define and build frameworks to improve monitoring, alerting, and observability across hundreds of services and systems
Define and own the roadmap for bringing observability to critical user journeys across Robinhood’s products
Deliver key insights and executive-level reporting to enable better business decisions around service quality and reliability
Act as a force multiplier through mentoring, technical influence, and contributions to hiring and engineering culture

Qualifications

Reliability engineering · Incident leadership · Observability frameworks · Distributed systems · Production operations · Capacity planning · Fault-tolerant architecture · Technical influence · Post-incident governance · Executive-level reporting · High-severity incident management · Measurable improvements · Multi-region architectures · Cross-functional collaboration · Communication · Mentoring · Modern observability stacks

Required

8+ years of software engineering experience, including significant experience operating production systems
4+ years focused on reliability engineering, infrastructure, distributed systems, or production operations
Hands-on experience serving in incident leadership roles (e.g., IMOC, incident commander, primary oncall)
Strong communication and cross-functional collaboration skills, especially during high-severity incidents
Deep knowledge of systems reliability, observability frameworks, and fault-tolerant architecture design
Experience with multi-region or multi-cluster architectures, capacity planning, and failover strategies
Familiarity with modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana)
Demonstrated ability to drive measurable improvements in MTTD, MTTR, availability, or customer impact

Benefits

Performance-driven compensation with multipliers for outsized impact, bonus programs, equity ownership, and 401(k) matching
Best-in-class benefits to fuel your work, including 100% paid health insurance for employees with 90% coverage for dependents
Lifestyle wallet - a highly flexible benefits spending account for wellness, learning, and more
Employer-paid life & disability insurance, fertility benefits, and mental health benefits
Time off to recharge including company holidays, paid time off, sick time, parental leave, and more!
Exceptional office experience with catered meals, events, and comfortable workspaces

Company

Robinhood

Robinhood is a stock brokerage that allows customers to buy and sell stocks, options, ETFs, and cryptocurrencies with zero commission.

Funding

Current Stage: Public Company
Total Funding: $6.23B
Key Investors: Emergent Fidelity Technologies, Ribbit Capital, D1 Capital Partners
2022-05-13 · Post-IPO Secondary · $648.29M
2021-07-29 · IPO
2021-02-01 · Private Equity · $2.4B

Leadership Team

Vlad Tenev
Co-Founder, Chairman, CEO
Kamal Boparai
Engineering Manager, Compliance & Legal Systems + ServiceNow
Company data provided by Crunchbase