LMArena · 3 weeks ago
Data Scientist
LMArena is the open platform for evaluating how AI models perform in the real world. As a Data Scientist, you will explore and reason about data that powers AI evaluations, collaborating with ML researchers and engineers to design experiments and analyze large-scale datasets.
Artificial Intelligence (AI) · Information Services · Machine Learning · Product Research
Responsibilities
Explore and analyze large, complex datasets to uncover patterns, biases, and causal relationships in model behavior and system performance
Formulate hypotheses about data quality, evaluation outcomes, and model performance — then design experiments to validate or refute them
Build reproducible analysis pipelines using Python, Pandas, NumPy, and Spark to process and interrogate large-scale data
Partner with ML researchers and engineers to design metrics and analyses that evaluate how models perform across domains, prompts, and tasks
Develop causal reasoning frameworks and statistical methods that help explain why models behave as they do — not just how well they perform
Communicate insights clearly to technical and non-technical partners (for example, via blog posts), informing both research direction and infrastructure improvements
Qualifications
Required
6+ years of experience in data science, ML analytics, or applied research, preferably in AI, ML, or large-scale data environments
Strong proficiency in Python, with deep experience in Pandas, NumPy, and distributed frameworks like Spark
Expertise in statistical modeling, causal inference, and experimental design
Experience reasoning about data distributions, sample quality, and the effects of data distribution shifts
Strong communication skills and the ability to collaborate closely with ML researchers and engineers
Preferred
Background in AI model evaluation
Experience working with LLM outputs (for example, LLM-as-a-judge), embeddings, or other large-scale model artifacts
Experience with A/B testing
Benefits
Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs.
Company
LMArena
LMArena is a web-based platform that evaluates large language models (LLMs) through anonymous, crowd-sourced pairwise comparisons.
Funding
Current Stage: Early Stage
Total Funding: $250M
2026-01-06 · Series A · $150M
2025-05-21 · Seed · $100M
Recent News
Sourcery · 2026-01-15
Tech Startups - Tech News, Tech Trends & Startup Funding · 2026-01-09
Company data provided by Crunchbase