Anthropic · 1 day ago
Senior Research Scientist, Reward Models
Anthropic is a public benefit corporation dedicated to creating reliable and beneficial AI systems. The company is seeking a Senior Research Scientist to lead research on improving how human preferences are specified and learned at scale, focusing on reward modeling for large language models.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
Responsibilities
Lead research on novel reward model architectures and training approaches for RLHF
Develop and evaluate LLM-based grading and evaluation methods, including rubric-driven approaches that improve consistency and interpretability
Research techniques to detect, characterize, and mitigate reward hacking and specification gaming
Design experiments to understand reward model generalization, robustness, and failure modes
Collaborate with the Finetuning team to translate research insights into improvements for production training pipelines
Contribute to research publications, blog posts, and internal documentation
Mentor other researchers and help build institutional knowledge around reward modeling
Qualifications
Required
Have a track record of research contributions in reward modeling, RLHF, or closely related areas of machine learning
Have experience training and evaluating reward models for large language models
Are comfortable designing and running large-scale experiments with significant computational resources
Can work effectively across research and engineering, iterating quickly while maintaining scientific rigor
Enjoy collaborative research and can communicate complex ideas clearly to diverse audiences
Care deeply about building AI systems that are both highly capable and safe
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience
Preferred
Have published research on reward modeling, preference learning, or RLHF
Have experience with LLM-as-judge approaches, including calibration and reliability challenges
Have worked on reward hacking, specification gaming, or related robustness problems
Have experience with constitutional AI, debate, or other scalable oversight approaches
Have contributed to production ML systems at scale
Have familiarity with interpretability techniques as applied to understanding reward model behavior
Benefits
Equity
Incentive compensation
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)
Funding
Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Company data provided by Crunchbase