Bespoke Labs · 5 hours ago
Data Scientist
Bespoke Labs is a venture-backed startup focused on building AI-native systems and next-generation digital products. They are seeking a Senior Data Scientist to build, deploy, and manage production-grade machine learning systems while contributing to AI training data strategy and model improvement. The role involves core applied data science and hands-on work with high-quality AI training datasets.
Responsibilities
Own the full data science lifecycle: problem framing → modeling → validation → production → iteration
Build and productionize statistical and ML models on large-scale datasets
Develop scalable feature pipelines and model workflows using Apache Spark
Partner with Data Engineers to deploy models into batch and real-time systems
Design and analyze experiments (A/B tests, causal inference) to measure impact
Translate model outputs into clear product and business decisions
Design and curate high-quality training datasets to improve model performance
Define data quality metrics, evaluation criteria, and error analysis frameworks
Work on data generation, filtering, and enrichment strategies for ML training
Analyze model failures and feed insights back into training data and feature design
Collaborate with ML teams on model retraining, fine-tuning, and iteration loops
Help establish best practices for scalable AI training pipelines
Qualifications
Required
6+ years of experience as a Data Scientist or Applied Scientist
Proven ownership of models running in production
Strong foundation in applied statistics and experimentation
Hands-on experience with large-scale data processing
Clear examples of measurable business or product impact
Python (NumPy, Pandas, Scikit-learn, PyTorch / TensorFlow)
Apache Spark (PySpark or Spark SQL) for large-scale data processing
Strong SQL
Feature engineering and model evaluation
Statistical modeling and hypothesis testing
Experimentation frameworks and A/B testing
Working with training datasets and model evaluation loops
Preferred
Models trained on TB-scale datasets
Spark-based feature pipelines or offline training workflows
Experience deploying models alongside data engineering pipelines
Hands-on involvement in AI training data curation or model fine-tuning
Experience in domains such as: Recommendations, Pricing, Fraud / Risk, Search / Ranking, Growth & Experimentation
Ability to clearly explain models and data decisions to non-technical stakeholders
Company
Bespoke Labs
RL for Agents
Funding
Current Stage
Early Stage