Senior Data Engineer jobs in United States

Bespoke Labs · 6 hours ago

Senior Data Engineer

Bespoke Labs is a premier, VC-backed AI research lab with an exceptionally talent-dense team of IIT and Ivy League alumni. They are seeking a top-tier Senior/Staff Data Engineer for a high-impact, 2-month sprint to architect and build the complex curation systems required for advanced AI model training.

Computer Software

Responsibilities

Architect AI-Scale Systems: Design the overarching data architecture and processing topology needed to programmatically curate and shape datasets at TB/PB scale, ensuring low latency and high consistency
Hands-On Development: Write production-grade code (Python/Scala, Spark) to build custom ingestion logic, highly efficient transformation scripts, and performant data validation checks
Complex Data Logic: Implement advanced filtering, deduplication, and quality-scoring algorithms at scale, ensuring the resulting data objects are optimized for LLM/ML consumption
Quality & Performance Tuning: Rigorously test, benchmark, and optimize processing workloads (CPU/memory tuning, partitioning strategies in Spark/Iceberg) to meet aggressive throughput targets
Domain Subject Matter Expert: Act as the ultimate technical authority on distributed systems, data processing, and cloud structures to ensure the training data factory meets enterprise-grade accuracy

Qualifications

Data Engineering, Python, Scala, Spark, Kafka, Airflow, Cloud Warehouses, Lakehouse Architecture, Reliability Engineering, End-to-End Ownership, High-Throughput Processing, Domain Expertise

Required

Experience: 6+ years of Data Engineering experience
Seniority: Demonstrated Senior/Staff-level ownership of production data platforms
Pedigree: Background at Tier-1 enterprises (FAANG, large SaaS, Fortune 100)
Technical Stack: Deep fluency in Python/Scala, Spark, Kafka, Airflow, and major cloud warehouses (Snowflake, BigQuery, Redshift)

Company

Bespoke Labs

RL for Agents

Funding

Current Stage: Early Stage
Company data provided by Crunchbase