AI Research Engineer, Enterprise Evaluations jobs in United States

Scale AI · 8 hours ago

AI Research Engineer, Enterprise Evaluations

Scale AI is seeking a technically rigorous and driven AI Research Engineer to join its Enterprise Evaluations team. This high-impact role focuses on developing and maintaining AI evaluation systems that ensure safety and reliability in LLM-powered workflows for enterprise clients.

AI Infrastructure · Artificial Intelligence (AI) · Data Collection and Labeling · Generative AI · Image Recognition · Machine Learning
H1B Sponsor Likely

Responsibilities

Partner with Scale’s Operations team and enterprise customers to translate ambiguity into structured evaluation data, guiding the creation and maintenance of gold-standard human-rated datasets and expert rubrics that anchor AI evaluation systems
Analyze feedback and collected data to identify patterns, refine evaluation frameworks, and establish iterative improvement loops that enhance the quality and relevance of human-curated assessments
Design, research, and develop LLM-as-a-Judge autorater frameworks and AI-assisted evaluation systems. This includes creating models that critique, grade, and explain agent outputs (e.g., RLAIF, model-judging-model setups), along with scalable evaluation pipelines and diagnostic tools
Pursue research initiatives that explore new methodologies for automatically analyzing, evaluating, and improving the behavior of enterprise agents, pushing the boundaries of how AI systems are assessed and optimized in real-world contexts

Qualifications

Large Language Models · Machine Learning · Generative AI · Python · ML frameworks · Statistical analysis · Model evaluation methodologies · Research skills · Collaboration · Problem-solving

Required

Bachelor's degree in Computer Science, Electrical Engineering, a related field, or equivalent practical experience
2+ years of experience in Machine Learning or Applied Research, focused on applied ML systems or evaluation infrastructure
Hands-on experience with Large Language Models (LLMs) and Generative AI in professional or research environments
Strong understanding of frontier model evaluation methodologies and the current research landscape
Proficiency in Python and major ML frameworks (e.g., PyTorch, TensorFlow)
Solid engineering and statistical analysis foundation, with experience developing data-driven methods for assessing model quality

Preferred

Advanced degree (Master's or Ph.D.) in Computer Science, Machine Learning, or a related quantitative field
Published research in leading ML or AI conferences such as NeurIPS, ICML, ICLR, or KDD
Experience designing, building, or deploying LLM-as-a-Judge frameworks or other automated evaluation systems for complex models
Experience collaborating with operations or external teams to define high-quality human annotator guidelines
Expertise in ML research engineering, stochastic systems, observability, or LLM-powered applications for model evaluation and analysis
Experience contributing to scalable pipelines that automate the evaluation and monitoring of large-scale models and agents
Familiarity with distributed computing frameworks and modern cloud infrastructure

Benefits

Comprehensive health, dental and vision coverage
Retirement benefits
A learning and development stipend
Generous PTO
Commuter stipend

Company

Scale AI

Scale’s mission is to develop reliable AI systems for the world’s most important decisions.

H1B Sponsorship

Scale AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (82)
2024 (54)
2023 (29)
2022 (17)
2021 (10)
2020 (10)

Funding

Current Stage: Late Stage
Total Funding: $15.9B
Key Investors: Meta, Accel, Tiger Global Management
2025-06-10 · Corporate Round · $14.3B
2025-06-04 · Series Unknown
2024-05-21 · Series F · $1B

Leadership Team

Jason Droege, Interim Chief Executive Officer
Dennis Cinelli, Chief Financial Officer
Company data provided by Crunchbase