Research Scientist, Interpretability jobs in United States

Anthropic · 1 day ago

Research Scientist, Interpretability

Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. They are seeking a Research Scientist to join their Interpretability team, which works on reverse-engineering how trained models function to ensure their safety and reliability.

Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
H1B Sponsored

Responsibilities

Develop methods for understanding LLMs by reverse engineering algorithms learned in their weights
Design and run robust experiments, both quickly in toy scenarios and at scale in large models
Create and analyze new interpretability features and circuits to better understand how models work
Build infrastructure for running experiments and visualizing results
Work with colleagues to communicate results internally and publicly

Qualifications

Mechanistic Interpretability, Python, Scientific Research, Experimental Design, Team Collaboration, Communication Skills

Required

You have a strong track record of scientific research (in any field) and have done some work on interpretability
You enjoy team science – working collaboratively to make big discoveries
You are comfortable with messy experimental science. We're inventing the field as we work, and the first textbook is years away
You view research and engineering as two sides of the same coin. Every team member writes code, designs and runs experiments, and interprets results
You can clearly articulate and discuss the motivations behind your work, and teach us about what you've learned. You like writing up and communicating your results, even when they're null
Familiarity with Python is required for this role
Education requirements: at least a Bachelor's degree in a related field, or equivalent experience

Benefits

Equity and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)

Funding

Current Stage
Late Stage
Total Funding
$33.74B
Key Investors
Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei
CEO & Co-Founder
Daniela Amodei
President & Co-Founder
Company data provided by Crunchbase