Turing · 17 hours ago

Head of Data Quality - RL Gyms

San Francisco, California, United States

Full-time

Hybrid

Director/Executive

7+ years of experience

Turing, a research accelerator for frontier AI labs, is seeking a Head of Data Quality for RL Environments in San Francisco, California. The role involves building and leading the quality function for reinforcement learning environment data, managing a team, and ensuring high standards for data quality in AI systems.

Artificial Intelligence (AI) · Generative AI · Information Technology · Machine Learning · Software Engineering

H1B Sponsor Likely

Responsibilities

Own the RL Environment Data Quality Vision & Strategy

Lead & Develop Data Quality Leads

Design Research-Grade Evaluation & Quality Systems for RL Environments

Translate AI & RL Research Trends into Environment and Data Requirements

Partner Across Operations, Product, and Customers

Build Tools, Processes, and Documentation

Qualifications

Reinforcement Learning (RL) · Machine Learning (ML) · Python · Data Quality Principles · Experimental Design · Human Evaluation Processes · Simulation Frameworks · Benchmark Design · Leadership · Communication

Required

Bachelor's degree in Computer Science, Mathematics, Engineering, or a related field; or equivalent practical experience

Strong technical background, including experience with Python as a primary language

Experience with RL or simulation frameworks (e.g., OpenAI Gym / Gymnasium–style APIs, custom simulators, or game engines)

7+ years total experience in software engineering, ML/AI, RL, simulation, or related fields

3+ years managing technical teams (e.g., research, data science, RL / simulation, data quality, or engineering)

Hands-on experience with ML/AI systems, with a strong preference for RL, RLHF/RLAIF, or agent-like systems (tool-using, web, or embodied agents)

Experience with environment or benchmark design, or large-scale agent evaluation

Prior exposure to data annotation / human feedback / human evaluation processes, including designing rubrics and tasks for human raters and working with preference data or trajectory labeling

High-level understanding of modern GenAI and RL / agents trends, such as LLM-based agents interacting with tools or environments; reward shaping, curriculum learning, and preference modeling; and safety, alignment, and robustness for agents in complex environments

Strong grasp of data and environment quality principles: environment correctness, coverage, and diversity; reward design pitfalls and reward hacking detection; and human evaluation quality, calibration, and inter-rater reliability

Ability to read ML/RL/AI research papers and translate them into new environment or task requirements, evaluation and benchmarking strategies, and concrete annotation and quality-control workflows

Excellent communication and leadership skills; comfortable setting direction and making tradeoff decisions in ambiguous, fast-changing domains

Preferred

Graduate degree (MS/PhD) in Computer Science, Machine Learning, Robotics, or related field

Experience working in or closely with a research lab or frontier AI organization focused on RL, agents, or aligned systems

Direct experience with designing RL benchmarks, simulators, or environment suites; RLHF/RLAIF pipelines or large-scale human feedback collection; or multi-agent or multi-task environments

Familiarity with game engines or simulation platforms (e.g., Unity, Unreal, MuJoCo, Isaac, Habitat, or similar)

Background in statistics and experimental design, especially for human feedback experiments and A/B testing of environment or reward variants

Experience in high-growth startup or similarly dynamic environments

Benefits

Amazing work culture (Super collaborative & supportive work environment; 5 days a week)

Awesome colleagues (Surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)

Competitive compensation

Flexible working hours

Company

Turing

Glassdoor: 3.9

Turing advances frontier AI and builds real-world systems for Fortune 500 companies, governments, and the world’s leading AI labs.

Founded in 2018

Palo Alto, California, USA

1001-5000 employees

https://www.turing.com

H1B Sponsorship

Turing has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)

Trends of Total Sponsorships

2025 (16)

2024 (8)

2023 (7)

2022 (16)

2021 (6)

Funding

Current Stage

Late Stage

Total Funding

$270.19M

Key Investors

Khazanah Nasional · AltaIR Capital · WestBridge Capital

2025-03-06 · Series E · $111M

2021-12-07 · Convertible Note · $6.85M

2021-10-04 · Series D · $87M

Leadership Team

Jonathan Siddharth

Founder & CEO

Vijay Krishnan

Founder & CTO

Recent News

Foundation Capital

Foundation Capital Portfolio

2025-12-31

BRIDGE

Turing, Developer of Fully Autonomous Driving Systems, Raises ¥15.27 Billion in Series A

2025-11-23

EIN Presswire

Former Google VP Catherine Lacavera Joins Toborlife AI Board of Directors

2025-11-22

Company data provided by Crunchbase