Checkmate · 2 weeks ago
Staff Data Scientist - Product Experimentation & Evaluation - US
Checkmate is seeking a Staff Data Scientist to act as a strategic partner across product, engineering, and trust and safety teams. The role involves defining evaluation frameworks, leading experiments, and driving product improvements through data science methodologies.
Responsibilities
Lead end-to-end experimentation: hypothesis generation, metric design, experiment design (A/B, multivariate, sequential, etc.), analysis, and interpretation
Build and maintain evaluation frameworks for LLMs: correctness, consistency, safety, hallucination detection, bias/fairness, etc.
Develop predictive models, classification/ranking systems, and heuristics to improve product features related to AI/language generation
Collaborate with prompt engineers & model builders to test prompt strategies, fine-tuning, or model selection; work on failure modes/error analysis
Automate experiment pipelines: dashboards, monitoring, alerting, instrumentation. Ensure data quality & measurement integrity
Use causal inference / observational studies when randomized experiments are not feasible
Present findings and recommendations to both technical and non-technical leadership; influence roadmap decisions
Drive experimentation in startup-like environments: rapid iteration, learning from limited data, and balancing speed with rigor
Shape large-scale product experimentation: define frameworks for experimentation at scale and integrate results into product strategy
Lead and mentor teams of data scientists, analysts, and engineers; set best practices for experiment design and AI product evaluation
Qualifications
Required
8-12+ years of experience in data science / ML roles, ideally with experiment design/product analytics
Proven track record in both startup-style and large-scale product experimentation
Experience leading teams, setting strategy, and driving execution in cross-functional environments
Strong background with statistical methods, causal inference, and rigorous measurement
Experience using LLMs / NLP / AI / prompt engineering or a closely related field
Excellent coding skills in Python (or similar), strong SQL, experience building and deploying models or analytic pipelines
Ability to work in cross-functional teams and translate technical results into business or product changes
Strong communication skills; ability to explain complex analyses to non-technical stakeholders
Preferred
Experience fine-tuning or working with multiple LLM providers / APIs
Experience with experiment platforms or building internal tooling for experimentation & model evaluation
Experience in voice / ASR or other multi-modal data
Benefits
Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k)
Life Insurance (Basic, Voluntary & AD&D)
Flexible Paid Time Off
Family Leave (Maternity, Paternity)
Short Term & Long Term Disability
Training & Development
Work From Home
Stock Option Plan
Company
Checkmate
Checkmate is a service that integrates third-party delivery platforms directly into POS systems.
H1B Sponsorship
Checkmate has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown; its legend marks the job field similar to this job)
Trends of Total Sponsorships: 2024 (1)
Funding
Current Stage: Growth Stage
Total Funding: $13M
Key Investors: Tiger Global Management
2024-10-23 · Series B · $10M
2018-11-12 · Series A · $3M
Company data provided by Crunchbase