Member of Technical Staff - Data Quality Engineer (Pre-training) jobs in United States

Reflection AI · 21 hours ago

Member of Technical Staff - Data Quality Engineer (Pre-training)

Reflection AI is on a mission to build open superintelligence and make it accessible to all. They are seeking a Data Quality Engineer to ensure that the data used to train their models meets high standards for quality and reliability, directly impacting model performance.

Computer Software
H1B Sponsor Likely

Responsibilities

Own upstream data quality for LLM pre-training, as a specialist or generalist across languages and modalities
Partner closely with research and pre-training teams to translate requirements into measurable quality signals, and provide actionable feedback to external data vendors
Design, validate, and scale automated QA methods that complement human-in-the-loop processes and reliably measure data quality across large campaigns
Build reusable QA pipelines that reliably deliver high-quality data to pre-training teams for model training
Monitor and report on data quality over time, driving continuous iteration on quality standards, processes, and acceptance criteria

Qualifications

Python, Building ML workflows, Automated QA methods, Large datasets, LLM familiarity, Analytical mindset, Communication, Detail-oriented

Required

Strong engineering fundamentals with experience building data pipelines, QA systems, or evaluation workflows for pre-training data
Detail-oriented with an analytical mindset, able to identify failure modes, inconsistencies, and subtle issues that affect data quality
Solid understanding of how data quality impacts pre-training, with the ability to translate quality concerns into concrete signals, decisions, and feedback
Experience designing and validating automated quality checks, including rule-based systems, statistical methods, or model-assisted approaches such as LLM-as-a-Judge
Comfortable working autonomously, owning problems end-to-end, and collaborating effectively with researchers, engineers, and operations partners
Proficiency in Python and in building ML/LLM workflows, with comfort debugging and writing scalable code
Experience working with large datasets and automated evaluation or quality-checking systems
Familiarity with how LLMs work, including the ability to describe how models are trained and evaluated
Excellent communication skills with the ability to clearly articulate complex technical concepts across teams

Benefits

Comprehensive medical, dental, vision, life, and disability insurance.
Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
Financial support for family planning.
Paid time off when you need it, relocation support, and more perks that optimize your time.
Lunch and dinner are provided daily.
Regular off-sites and team celebrations.

Company

Reflection AI

Frontier open intelligence accessible to all. Our team previously built frontier LLMs at labs like DeepMind, OpenAI, and Anthropic.

H1B Sponsorship

Reflection AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships: 2025 (5)

Funding

Current Stage: Early Stage
Company data provided by Crunchbase