Anthropic · 2 hours ago
Applied Safety Research Engineer, Safeguards
Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. They are seeking an Applied Safety Research Engineer to develop methods for improving safety evaluations of AI models, ensuring that evaluations reflect real-world usage and enhance model safety across various factors.
Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
Responsibilities
Design and run experiments to improve evaluation quality—developing methods to generate representative test data, simulate realistic user behavior, and validate grading accuracy
Research how different factors (multi-turn conversations, tools, long context, user diversity) impact model safety behavior
Analyze evaluation coverage to identify gaps and inform where we need better measurement
Productionize successful research into evaluation pipelines that run during model training, launch and beyond
Collaborate with Policy and Enforcement to translate real-world harm patterns into measurable evaluations
Build tooling that enables policy experts to create and iterate on evaluations
Surface findings to research and training teams to drive upstream model improvements
Qualifications
Required
Have 4+ years of software engineering or ML engineering experience
Are proficient in Python and comfortable working across the stack
Have experience building and maintaining data pipelines
Are comfortable with data analysis and can draw insights from large datasets
Have experience with LLMs and understand their capabilities and failure modes
Can move fluidly between prototyping and production-quality code
Are excited by ambiguous problems and can translate them into concrete experiments
Care deeply about AI safety and want your work to have real impact
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience
Preferred
Red teaming, adversarial testing, or jailbreak research on AI systems
Building or contributing to LLM evaluation frameworks or benchmarks
Trust and safety, content moderation, or abuse detection systems
Synthetic data generation or data augmentation
Distributed systems or large-scale data processing
Prompt engineering or LLM application development
Benefits
Equity and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
A lovely office space in which to collaborate with colleagues
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)
Funding
Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Recent News
2026-01-11
Insurance giant Allianz signs Claude Code deal with Anthropic | CIO
Company data provided by Crunchbase