Anthropic · 6 hours ago

ML/Research Engineer, Safeguards

San Francisco, CA | New York City, NY

Full-time

Hybrid

Mid, Senior Level

$350K/yr - $500K/yr

4+ years exp

Anthropic is a public benefit corporation dedicated to creating reliable and interpretable AI systems. They are seeking ML Engineers and Research Engineers to help detect and mitigate misuse of AI systems, focusing on building systems that identify harmful use and developing defenses to keep their products safe.

Artificial Intelligence (AI) | Foundational AI | Generative AI | Information Technology | Machine Learning

H1B Sponsored

Responsibilities

Develop classifiers to detect misuse and anomalous behavior at scale. This includes developing synthetic data pipelines for training classifiers and methods to automatically source representative evaluations to iterate on

Build systems to monitor for harms that span multiple exchanges, such as coordinated cyber attacks and influence operations, and develop new methods for aggregating and analyzing signals across contexts

Evaluate and improve the safety of agentic products—developing both threat models and environments to test for agentic risks, and developing and deploying mitigations for prompt injection attacks

Conduct research on automated red-teaming, adversarial robustness, and other research that helps test for or find misuse

Qualifications

ML engineering | Python | Building classifiers | Adversarial machine learning | Reinforcement learning | Behavioral ML | Language modeling | Interpretability | Anomaly detection | Communication skills

Required

Have 4+ years of experience in ML engineering, research engineering, or applied research, in academia or industry

Have proficiency in Python and experience building ML systems

Are comfortable working across the research-to-deployment pipeline, from exploratory experiments to production systems

Are worried about misuse risks of AI systems, and want to work to mitigate them

Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders

We require at least a Bachelor's degree in a related field or equivalent experience

Preferred

Language modeling and transformers

Building classifiers, anomaly detection systems, or behavioral ML

Adversarial machine learning or red-teaming

Interpretability or probes

Reinforcement learning

High-performance, large-scale ML systems

Benefits

Equity and benefits

Optional equity donation matching

Generous vacation and parental leave

Flexible working hours

A lovely office space in which to collaborate with colleagues

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

Founded in 2021

San Francisco, California, USA

501-1000 employees

https://www.anthropic.com

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced)

Trends of Total Sponsorships

2025 (105)

2024 (13)

2023 (3)

2022 (4)

2021 (1)

Funding

Current Stage

Late Stage

Total Funding

$33.74B

Key Investors

Lightspeed Venture Partners | Google | Amazon

2025-09-02 · Series F · $13B

2025-05-16 · Debt Financing · $2.5B

2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei

CEO & Co-Founder

Daniela Amodei

President & Co-Founder

Recent News

contxto.com

Venture capital in 2025: The year of artificial intelligence and mega-investment rounds

2026-01-14

Longevity.Technology

Function Health launches Claude integration to enhance AI-powered health insights

2026-01-14

The Hacker News

Anthropic Launches Claude AI for Healthcare with Secure Health Record Access

2026-01-14

Company data provided by Crunchbase