Anthropic · 6 hours ago
ML/Research Engineer, Safeguards
Anthropic is a public benefit corporation dedicated to creating reliable and interpretable AI systems. They are seeking ML Engineers and Research Engineers to help detect and mitigate misuse of AI systems, focusing on building systems that identify harmful use and developing defenses to keep their products safe.
Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
Responsibilities
Develop classifiers to detect misuse and anomalous behavior at scale. This includes developing synthetic data pipelines for training classifiers and methods to automatically source representative evaluations to iterate on
Build systems to monitor for harms that span multiple exchanges, such as coordinated cyber attacks and influence operations, and develop new methods for aggregating and analyzing signals across contexts
Evaluate and improve the safety of agentic products—developing both threat models and environments to test for agentic risks, and developing and deploying mitigations for prompt injection attacks
Conduct research on automated red-teaming, adversarial robustness, and other research that helps test for or find misuse
Qualifications
Required
Have 4+ years of experience in ML engineering, research engineering, or applied research, in academia or industry
Have proficiency in Python and experience building ML systems
Are comfortable working across the research-to-deployment pipeline, from exploratory experiments to production systems
Are worried about misuse risks of AI systems, and want to work to mitigate them
Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders
We require at least a Bachelor's degree in a related field or equivalent experience
Preferred
Language modeling and transformers
Building classifiers, anomaly detection systems, or behavioral ML
Adversarial machine learning or red-teaming
Interpretability or probes
Reinforcement learning
High-performance, large-scale ML systems
Benefits
Equity and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
A lovely office space in which to collaborate with colleagues
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)
Funding
Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Recent News
Longevity.Technology
2026-01-14
Company data provided by Crunchbase