Research Engineer, Pretraining Scaling jobs in United States

Anthropic · 2 days ago

Research Engineer, Pretraining Scaling

Anthropic is a public benefit corporation focused on creating reliable and beneficial AI systems. The Research Engineer on the ML Performance and Scaling team ensures the efficient training of production pretrained models, with responsibilities spanning performance optimization, debugging, and collaboration across teams.

Artificial Intelligence (AI), Foundational AI, Generative AI, Information Technology, Machine Learning
H1B Sponsored

Responsibilities

Own critical aspects of our production pretraining pipeline, including model operations, performance optimization, observability, and reliability
Debug and resolve complex issues across the full stack—from hardware errors and networking to training dynamics and evaluation infrastructure
Design and run experiments to improve training efficiency, reduce step time, increase uptime, and enhance model performance
Respond to on-call incidents during model launches, diagnosing problems quickly and coordinating solutions across teams
Build and maintain production logging, monitoring dashboards, and evaluation infrastructure
Add new capabilities to the training codebase, such as long context support or novel architectures
Collaborate closely with teammates across SF and London, as well as with Tokens, Architectures, and Systems teams
Contribute to the team's institutional knowledge by documenting systems, debugging approaches, and lessons learned

Qualifications

Large language models, JAX, TPU, PyTorch, Distributed systems, Performance optimization, Debugging, Experimental design, Production ML systems, Collaboration, Communication, Problem-solving

Required

At least a Bachelor's degree in a related field or equivalent experience
Hands-on experience training large language models, or deep expertise with JAX, TPU, PyTorch, or large-scale distributed systems
Enjoy both research and engineering work, ideally split roughly 50/50
Excited about being on-call for production systems, working long days during launches, and solving hard problems under pressure
Thrive when working on whatever is most impactful, even if that changes day-to-day based on what the production model needs
Excel at debugging complex, ambiguous problems across multiple layers of the stack
Communicate clearly and collaborate effectively, especially when coordinating across time zones or during high-stress incidents
Are passionate about the work itself and want to refine your craft as a research engineer
Care about the societal impacts of AI and responsible scaling

Preferred

Previous experience training LLMs or working extensively with JAX/TPU, PyTorch, or other ML frameworks at scale
Contributed to open-source LLM frameworks (e.g., open_lm, llm-foundry, mesh-transformer-jax)
Published research on model training, scaling laws, or ML systems
Experience with production ML systems, observability tools, or evaluation infrastructure
Background as a systems engineer, quant, or in other roles requiring both technical depth and operational excellence

Benefits

Equity and benefits
Generous vacation and parental leave
Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)

Funding

Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei
CEO & Co-Founder
Daniela Amodei
President and co-founder
Company data provided by Crunchbase