Anthropic · 16 hours ago

Research Engineer, Environment Scaling

United States

Full-time

Remote

Mid, Senior Level

$350K/yr - $850K/yr

Anthropic is a public benefit corporation focused on creating reliable and beneficial AI systems. They are seeking a Research Engineer for their Environment Scaling team, which aims to enhance the intelligence of public models by developing training environments for reinforcement learning. The role involves managing vendor relationships, designing reward signals, and collaborating with domain experts to improve model performance.

Artificial Intelligence (AI), Information Technology, Foundational AI, Generative AI, Machine Learning

H1B Sponsored

Responsibilities

Improve and execute our fine-tuning strategies for adapting Claude to new domains and tasks

Manage technical relationships with external data vendors, including evaluation of data quality and reward design

Collaborate with domain experts to design data pipelines and evaluations

Explore novel ways of creating RL environments for high value tasks

Develop and improve QA frameworks to catch reward hacking and ensure environment quality

Partner with other RL research teams and product teams to translate capability goals into training environments and evals

Qualifications

Fine-tuning large language models, Reinforcement learning, Reward design, Data operations, Project management, Technical vendor management, Domain expertise, Interpersonal skills

Required

At least a Bachelor's degree in a related field or equivalent experience

Experience with fine-tuning large language models for specific domains or real-world use cases and/or domain expertise in an area where we would like to make our models more useful

Experience with reinforcement learning, reward design, or training data curation for LLMs

Comfortable managing technical vendor relationships and iterating quickly on feedback

Strong project management and interpersonal skills

Passionate about making AI more useful and accessible across different industries

Excited about a role that includes a combination of ML research, data operations, and project management

Preferred

Experience training production ML systems

Familiar with distributed systems and cloud infrastructure

Domain expertise in an area where we would like to make our models more useful

Experience working with external vendors or technical partners

Benefits

Optional equity donation matching

Generous vacation and parental leave

Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

Founded in 2021

San Francisco, California, USA

501-1000 employees

https://www.anthropic.com

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship (chart not included)

Trends of Total Sponsorships

2025 (105)

2024 (13)

2023 (3)

2022 (4)

2021 (1)

Funding

Current Stage

Late Stage

Total Funding

$33.74B

Key Investors

Fidelity, ICONIQ Capital, Lightspeed Venture Partners, Google

2025-09-02 · Series F · $13B

2025-05-16 · Debt Financing · $2.5B

2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei

CEO and Co-Founder

Daniela Amodei

President and Co-Founder

Recent News

Hedgeweek

Man Group partners with Anthropic to embed AI across investment processes

2026-02-12

Benzinga.com

Destiny Tech100 Stock Rises After Adding Anthropic Exposure

2026-02-12

legacy.thefly.com

AI Daily: Blackstone raising stake in Anthropic in current funding round

2026-02-12

Company data provided by Crunchbase