Member of Technical Staff - Data Ingestion Engineer jobs in United States

Reflection AI · 10 hours ago

Member of Technical Staff - Data Ingestion Engineer

Reflection AI is dedicated to building open superintelligence and making it accessible to all. They are seeking a Data Ingestion Engineer to build and operate ingestion systems that transform large-scale data sources into structured corpora for training AI models, collaborating closely with researchers and data quality teams.

Computer Software
H1B Sponsor Likely

Responsibilities

Build and operate large-scale data ingestion systems for pre-training, including web crawling, extraction, and dataset delivery
Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs
Analyze ingested data to identify gaps, redundancy, and areas to improve
Build ingestion pipelines that scale reliably across large data campaigns
Develop specialized crawlers for high-priority data sources
Review code, debug production issues, and continuously improve ingestion infrastructure

Qualifications

Data ingestion systems, Web crawling, Large-scale data acquisition, Ray, Beam, Spark, LLM training knowledge, Experiment design, Communication

Required

Experience building web crawling, data ingestion, or large-scale data acquisition systems using Ray, Beam, Spark, or similar technologies
Familiarity with how LLMs are trained and evaluated, and an intuition for what makes data useful for training
Comfortable working with very large datasets (multi-TB to PB scale) and building systems that are observable, testable, and maintainable
Comfortable designing experiments and using data to guide system improvements
Excellent communication skills: you can explain system behavior, and you weigh and communicate tradeoffs clearly

Benefits

Comprehensive medical, dental, vision, life, and disability insurance.
Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
Financial support for family planning.
Paid time off when you need it.
Relocation support.
Lunch and dinner are provided daily.
Regular off-sites and team celebrations.

Company

Reflection AI

Frontier open intelligence accessible to all. Our team previously built frontier LLMs at labs like DeepMind, OpenAI, and Anthropic.

H1B Sponsorship

Reflection AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2025 (5)

Funding

Current Stage
Early Stage
Company data provided by Crunchbase