
Turing · 17 hours ago

Head of Data Quality - RL Gyms

Turing, a research accelerator for frontier AI labs based in San Francisco, California, is seeking a Head of Data Quality for RL Environments. The role involves building and leading the quality function for reinforcement learning environment data, managing a team, and maintaining high data quality standards for AI systems.

Artificial Intelligence (AI) · Generative AI · Information Technology · Machine Learning · Software Engineering
H1B Sponsor Likely

Responsibilities

Own the RL Environment Data Quality Vision & Strategy
Lead & Develop Data Quality Leads
Design Research-Grade Evaluation & Quality Systems for RL Environments
Translate AI & RL Research Trends into Environment and Data Requirements
Partner Across Operations, Product, and Customers
Build Tools, Processes, and Documentation

Qualifications

Reinforcement Learning (RL) · Machine Learning (ML) · Python · Data Quality Principles · Experimental Design · Human Evaluation Processes · Simulation Frameworks · Benchmark Design · Leadership · Communication

Required

Bachelor's degree in Computer Science, Mathematics, Engineering, or a related field; or equivalent practical experience
Strong technical background, including experience with Python as a primary language
Experience with RL or simulation frameworks (e.g., OpenAI Gym / Gymnasium–style APIs, custom simulators, or game engines)
7+ years total experience in software engineering, ML/AI, RL, simulation, or related fields
3+ years managing technical teams (e.g., research, data science, RL / simulation, data quality, or engineering)
Hands-on experience with ML/AI systems, with a strong preference for RL, RLHF/RLAIF, or agent-like systems (tool-using, web, or embodied agents), and for environment or benchmark design or large-scale agent evaluation
Prior exposure to data annotation / human feedback / human evaluation processes, including designing rubrics and tasks for human raters and working with preference data or trajectory labeling
High-level understanding of modern GenAI and RL / agents trends, such as LLM-based agents interacting with tools or environments; reward shaping, curriculum learning, and preference modeling; and safety, alignment, and robustness for agents in complex environments
Strong grasp of data and environment quality principles: environment correctness, coverage, and diversity; reward design pitfalls and reward hacking detection; and human evaluation quality, calibration, and inter-rater reliability
Ability to read ML/RL/AI research papers and translate them into new environment or task requirements, evaluation and benchmarking strategies, and concrete annotation and quality-control workflows
Excellent communication and leadership skills; comfortable setting direction and making tradeoff decisions in ambiguous, fast-changing domains

Preferred

Graduate degree (MS/PhD) in Computer Science, Machine Learning, Robotics, or related field
Experience working in or closely with a research lab or frontier AI organization focused on RL, agents, or aligned systems
Direct experience with designing RL benchmarks, simulators, or environment suites; RLHF/RLAIF pipelines or large-scale human feedback collection; or multi-agent or multi-task environments
Familiarity with game engines or simulation platforms (e.g., Unity, Unreal, MuJoCo, Isaac, Habitat, or similar)
Background in statistics and experimental design, especially for human feedback experiments and A/B testing of environment or reward variants
Experience in high-growth startup or similarly dynamic environments

Benefits

Amazing work culture (Super collaborative & supportive work environment; 5 days a week)
Awesome colleagues (Surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
Competitive compensation
Flexible working hours

Company

Turing advances frontier AI and builds real-world systems for Fortune 500 companies, governments, and the world’s leading AI labs.

H1B Sponsorship

Turing has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
Distribution of Different Job Fields Receiving Sponsorship (chart)
Total Sponsorships by Year
2025: 16
2024: 8
2023: 7
2022: 16
2021: 6

Funding

Current Stage: Late Stage
Total Funding: $270.19M
Key Investors: Khazanah Nasional, AltaIR Capital, WestBridge Capital
2025-03-06 · Series E · $111M
2021-12-07 · Convertible Note · $6.85M
2021-10-04 · Series D · $87M

Leadership Team

Jonathan Siddharth, Founder & CEO
Vijay Krishnan, Founder & CTO
Company data provided by Crunchbase