Anthropic · 16 hours ago

Technical Program Manager, Safeguards – Infrastructure & Evals

San Francisco, CA

Full-time

Hybrid

Senior Level

$290K/yr - $365K/yr

Anthropic is a public benefit corporation focused on creating safe and beneficial AI systems. As a Technical Program Manager for Safeguards Infrastructure and Evals, you will oversee the operational health of AI safety systems, drive incident response processes, and coordinate infrastructure improvements to ensure reliability in production environments.

Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning

H1B Sponsored

Responsibilities

Own the Safeguards Engineering ops review - Drive the recurring cadence that keeps the team informed and coordinated: surfacing recent incidents and failures, bringing visibility to reliability trends, and making sure the right people are in the room when decisions need to be made. This is the heartbeat of how Safeguards Eng stays ahead of operational risk

Drive incident tracking and post-mortem execution - When incidents happen — and in this space, they happen regularly — you'll make sure they get followed through properly. That means tracking incidents across the organization (including those owned by partner teams like Inference), ensuring post-mortems get written, and most critically, making sure the action items that come out of them actually get done. Closing the loop on post-mortem actions is one of the highest-leverage things this role does

Establish and maintain SLOs with partner teams - Work with Safeguards Engineering teams and key partners — particularly Inference and Cloud Inference — to define service-level objectives for safety-critical pipelines. Then build the tracking and reporting that makes it possible to tell whether those SLOs are being met, and surface it when they're not

Maintain runbook quality and incident-ownership clarity - Safety-critical systems need clear playbooks for when things go wrong. Partner with engineering leads to keep runbooks accurate, actionable, and up to date — and ensure that ownership of incidents (including for areas like account-banning false positives and CSAM detection) is unambiguous so that nothing falls through the cracks during an active incident

Drive platform migrations and infrastructure projects - Own the program management for the larger infrastructure work on the roadmap: migrating infrastructure from one platform to the next, moving from one incident platform to the next and from one cloud monitoring system to another, and other migrations as they come. These are cross-team efforts with real dependencies; your job is to keep them sequenced, on track, and connected to the teams that need them

Coordinate evals platform improvements - Partner with the evals engineering team to drive improvements to the evaluation platform — including self-serve capabilities and the broader eval factory infrastructure. Help scope the work, track dependencies on other Safeguards systems, and make sure the evals platform is keeping pace with the team's needs

Qualifications

Technical Program Management, Incident Management, SLO Definition, AI Safety, Infrastructure Migration, Monitoring Tools, Operational Cadences, Cross-team Coordination, Communication Skills

Required

At least a Bachelor's degree in a related field or equivalent experience

Solid technical program management experience, particularly in operational or infrastructure-heavy environments

Understanding of production ML systems sufficient to triage incidents intelligently

Ability to close loops on post-mortem action items, SLOs, and runbooks

Effective coordination across team boundaries

Ability to thrive in environments where work shifts between 'keep the lights on' and 'build something new.'

Experience with or strong interest in AI safety

Preferred

Experience with SRE practices, incident management frameworks, or on-call operations at scale

Experience working on or with evaluation infrastructure for ML systems

Experience driving infrastructure migrations in complex, multi-team environments

Familiarity with monitoring and alerting tooling (PagerDuty, Datadog, or equivalents) and the operational culture around them

Benefits

Optional equity donation matching

Generous vacation and parental leave

Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on AI safety and on aligning AI systems with human values.

Founded in 2021

San Francisco, California, USA

501-1000 employees

https://www.anthropic.com

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data provided by the US Department of Labor)

[Chart: Distribution of Different Job Fields Receiving Sponsorship]

Trends of Total Sponsorships

2025 (105)

2024 (13)

2023 (3)

2022 (4)

2021 (1)

Funding

Current Stage

Late Stage

Total Funding

$33.74B

Key Investors

Fidelity, ICONIQ Capital, Lightspeed Venture Partners, Google

2025-09-02 · Series F · $13B

2025-05-16 · Debt Financing · $2.5B

2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei

Co-Founder and CEO

Daniela Amodei

President and Co-Founder

Recent News

Decrypt

Anthropic Trolls OpenAI's ChatGPT in Super Bowl Ad Campaign

2026-02-05

Inc42 Media

Inside India’s Long AI Game

2026-02-05

Benzinga.com

Software 'SaaSpocalypse:' BTIG Sees Salesforce, ServiceNow Rebound, But Jim Cramer Warns Of Permanent AI Obsolescence

2026-02-05

Company data provided by Crunchbase