Staff Machine Learning Data Engineer jobs in United States

Backflip AI · 2 months ago

Staff Machine Learning Data Engineer

Backflip.ai is building a foundation model for mechanical design, aiming to democratize the ability to create in the physical world. The Staff Machine Learning Data Engineer will lead the development of data pipelines that power this model, collaborating with various teams to enhance model performance through data-driven insights.

Artificial Intelligence (AI) · Creative Agency · Virtual Reality

Responsibilities

Architect and own Backflip’s ML data pipeline, from ingestion to processing to evaluation
Define data strategy: establish best practices for data augmentation, filtering, and sampling at scale
Design scalable data systems for multimodal training (text, geometry, CAD, and more)
Develop and automate data collection, curation, and validation workflows
Collaborate with MLEs to design and execute experiments that measure and improve model performance
Build tools and metrics for dataset analysis, monitoring, and quality assurance
Contribute to model development through insights grounded in data, shaping what, how, and when we train

Qualifications

ML data pipeline architecture · Data engineering for ML · Python · Data augmentation strategies · Large-scale data processing · Data quality focus · Analytical thinking · Collaboration skills

Required

You've built and maintained ML data pipelines at scale, ideally for foundation or generative models, that shipped into production in the real world
You have deep experience with data engineering for ML, including distributed systems, data extraction, transformation, and loading, and large-scale data processing (e.g. PySpark, Beam, Ray, or similar)
You're fluent in Python and experienced with ML frameworks and data formats (Parquet, TFRecord, HuggingFace datasets, etc.)
You've developed data augmentation, sampling, or curation strategies that improved model performance
You think like both an engineer and an experimentalist: curious, analytical, and grounded in evidence
You collaborate well across AI development, infra, and product, and enjoy building the data systems that make great models possible
You care deeply about data quality, reproducibility, and scalability
You're excited to help shape the future of AI for physical design

Preferred

You are comfortable working with a variety of complex data formats, e.g. for 3D geometry kernels or rendering engines
You have an interest in math, geometry, topology, rendering, or computational geometry
You've worked in 3D printing, CAD, or computer graphics domains

Company

Backflip AI

Backflip AI operates an AI platform that pairs 3D generative AI with people.

Funding

Current Stage
Early Stage
Total Funding
$30M
2024-12-19 · Series A · $30M

Leadership Team

Gregory Mark
Founder and CEO
Company data provided by Crunchbase