Turing · 2 months ago
Frontier Data Lead - Coding
Turing is a leading research accelerator for frontier AI labs based in San Francisco, California. The Frontier Data Lead – Code will oversee the creation of datasets and reinforcement learning environments for coding agents, while managing teams and ensuring data quality for AI lab clients.
Artificial Intelligence (AI) · Generative AI · Information Technology · Machine Learning · Software Engineering
Responsibilities
Lead the creation of datasets, RL environments, and evals focused on coding agents and software engineering for one or more AI lab customers
Ensure that everything you ship to clients meets frontier standards for realism, correctness, diversity, and difficulty
Set up quality rubrics, automated validation scripts, and human review processes for every stage of data generation
Build and lead cross-functional teams of software engineers, researchers, QAs, and data creators drawn from Turing’s 4M+ developer network
Interview, onboard, train, and mentor team members to ensure consistent output quality and technical excellence
Act as the primary technical point of contact for your customer projects, interfacing directly with researchers and engineers at frontier AI labs to understand their coding agent roadmap and model data needs, to gather feedback, and to co-define success criteria for your projects
Provide regular progress updates, surface insights from model evaluations, and incorporate client feedback to improve future iterations
Fine-tune models in-house on Turing-generated datasets or on trajectories generated in Turing RL environments, and measure model improvement as evidence of data quality
Proactively build benchmarks and run evals on frontier models and coding agents to identify strengths and weaknesses on SWE tasks, and use these insights to inform the product roadmap
Equip customer-facing teams with evaluation reports, sample datasets, and training so they can communicate your data offerings to customers effectively
Publish research papers and technical posts on Turing’s data products, innovations in our synthetic data generation / automation pipelines, evaluations of frontier agents and models, and Turing’s model fine-tuning results on our datasets
Oversee development of internal tools that accelerate data generation and verification (e.g., automated data scraping pipelines, unit test generators, repo sandboxing)
Design dashboards and APIs for customers to run model evals, view performance reports, and integrate Turing data directly into their post-training pipelines
Qualifications
Required
Post-training experience on SWE tasks or experience building coding agents
Engineering management experience: has led teams of engineers, including interviewing and hiring them and setting up QA processes
Hands-on technical capability: fluency in Python and proficiency in one or more major languages (C++, Java, Go, Rust, or JS)
Operational leadership: Proven ability to manage complex data pipelines, multi-stakeholder delivery, and concurrent high-stakes projects
Cross-functional communicator: ability to communicate clearly with researchers at frontier AI labs, subject matter experts for various domains, and diverse teams
Preferred
Background in Computer Science, Machine Learning, or related technical field
Benefits
Amazing work culture (Super collaborative & supportive work environment; 5 days a week)
Awesome colleagues (Surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
Competitive compensation
Flexible working hours
Company
Turing
Turing advances frontier AI and builds real-world systems for Fortune 500 companies, governments, and the world’s leading AI labs.
H1B Sponsorship
Turing has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (16)
2024 (8)
2023 (7)
2022 (16)
2021 (6)
Funding
Current Stage: Late Stage
Total Funding: $270.19M
Key Investors: Khazanah Nasional, AltaIR Capital, WestBridge Capital
2025-03-06 · Series E · $111M
2021-12-07 · Convertible Note · $6.85M
2021-10-04 · Series D · $87M
Company data provided by crunchbase