ML/Research Engineer, Safeguards jobs in United States

Anthropic · 6 hours ago

ML/Research Engineer, Safeguards

Anthropic is a public benefit corporation dedicated to creating reliable and interpretable AI systems. They are seeking ML Engineers and Research Engineers to help detect and mitigate misuse of AI systems, focusing on building systems that identify harmful use and developing defenses to keep their products safe.

Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
H1B Sponsored

Responsibilities

Develop classifiers to detect misuse and anomalous behavior at scale. This includes developing synthetic data pipelines for training classifiers and methods to automatically source representative evaluations to iterate on
Build systems to monitor for harms that span multiple exchanges, such as coordinated cyber attacks and influence operations, and develop new methods for aggregating and analyzing signals across contexts
Evaluate and improve the safety of agentic products—developing both threat models and environments to test for agentic risks, and developing and deploying mitigations for prompt injection attacks
Conduct research on automated red-teaming, adversarial robustness, and other research that helps test for or find misuse

Qualifications

ML engineering, Python, Building classifiers, Adversarial machine learning, Reinforcement learning, Behavioral ML, Language modeling, Interpretability, Anomaly detection, Communication skills

Required

Have 4+ years of experience in ML engineering, research engineering, or applied research, in academia or industry
Have proficiency in Python and experience building ML systems
Are comfortable working across the research-to-deployment pipeline, from exploratory experiments to production systems
Are worried about misuse risks of AI systems, and want to work to mitigate them
Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders
We require at least a Bachelor's degree in a related field or equivalent experience

Preferred

Language modeling and transformers
Building classifiers, anomaly detection systems, or behavioral ML
Adversarial machine learning or red-teaming
Interpretability or probes
Reinforcement learning
High-performance, large-scale ML systems

Benefits

Equity and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
A lovely office space in which to collaborate with colleagues

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)

Funding

Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei, CEO & Co-Founder
Daniela Amodei, President & Co-Founder
Company data provided by Crunchbase