Anthropic · 2 hours ago

Applied Safety Research Engineer, Safeguards

San Francisco, CA

Full-time

Hybrid

Mid, Senior Level

$320K/yr - $405K/yr

4+ years exp

Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. They are seeking an Applied Safety Research Engineer to develop methods for improving safety evaluations of AI models, ensuring that evaluations reflect real-world usage and enhance model safety across various factors.

Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning

H1B Sponsored

Responsibilities

Design and run experiments to improve evaluation quality—developing methods to generate representative test data, simulate realistic user behavior, and validate grading accuracy

Research how different factors (multi-turn conversations, tools, long context, user diversity) impact model safety behavior

Analyze evaluation coverage to identify gaps and inform where we need better measurement

Productionize successful research into evaluation pipelines that run during model training, launch and beyond

Collaborate with Policy and Enforcement to translate real-world harm patterns into measurable evaluations

Build tooling that enables policy experts to create and iterate on evaluations

Surface findings to research and training teams to drive upstream model improvements

Qualifications

Python · ML engineering · Data pipelines · Data analysis · LLMs experience · Prototyping · AI safety · Red teaming · Trust and safety · Synthetic data generation · Distributed systems · Prompt engineering

Required

Have 4+ years of software engineering or ML engineering experience

Are proficient in Python and comfortable working across the stack

Have experience building and maintaining data pipelines

Are comfortable with data analysis and can draw insights from large datasets

Have experience with LLMs and understand their capabilities and failure modes

Can move fluidly between prototyping and production-quality code

Are excited by ambiguous problems and can translate them into concrete experiments

Care deeply about AI safety and want your work to have real impact

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience

Preferred

Red teaming, adversarial testing, or jailbreak research on AI systems

Building or contributing to LLM evaluation frameworks or benchmarks

Trust and safety, content moderation, or abuse detection systems

Synthetic data generation or data augmentation

Distributed systems or large-scale data processing

Prompt engineering or LLM application development

Benefits

Equity and benefits

Optional equity donation matching

Generous vacation and parental leave

Flexible working hours

A lovely office space in which to collaborate with colleagues

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

Founded in 2021

San Francisco, California, USA

501-1000 employees

https://www.anthropic.com

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data provided by the US Department of Labor)

[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted field is the one similar to this job]

Trends of Total Sponsorships

2025 (105)

2024 (13)

2023 (3)

2022 (4)

2021 (1)

Funding

Current Stage

Late Stage

Total Funding

$33.74B

Key Investors

Lightspeed Venture Partners · Google · Amazon

2025-09-02 · Series F · $13B

2025-05-16 · Debt Financing · $2.5B

2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei

CEO & Co-Founder

Daniela Amodei

President & Co-Founder

Recent News

PitchBook

Surge in mega-deals vaults US to top of global VC per capita list

2026-01-11

CIO

Insurance giant Allianz signs Claude Code deal with Anthropic

2026-01-11

VentureBeat

Anthropic cracks down on unauthorized Claude usage by third-party harnesses and rivals

2026-01-11

Company data provided by Crunchbase