Replit · 1 month ago
Data Scientist, AI Agent
Replit is the agentic software creation platform that enables anyone to build applications using natural language. The role involves directly impacting Replit's AI agent by defining success metrics, designing experiments, and turning data into actionable insights for the AI team and leadership.
Artificial Intelligence (AI) · Cloud Computing · Developer Tools · Information Technology · Software
Responsibilities
Design and analyze experiments to measure agent improvements—from model changes to UX variations—with statistical rigor and practical tradeoffs
Define success metrics that connect agent trace data (prompts, responses, code changes, execution outcomes) to user outcomes like successful deploys, retention, and revenue
Build the semantic layer for agent data in partnership with data engineering—defining the tables, metrics, and models that enable self-serve analysis across the AI team
Surface insights from trace analysis that identify failure modes, successful patterns, and opportunities to improve agent effectiveness
Partner with AI engineering, product, and leadership to translate data into roadmap decisions; you'll have a seat at the table for critical agent strategy discussions
Create dashboards and reporting that surface agent performance metrics (task completion, latency, quality scores, user satisfaction) for the AI team and executives
Design an experiment to measure whether a new model improves task completion rates, accounting for user heterogeneity and novelty effects
Build outcome-linked data models that connect agent trajectories to downstream success (deployments, user satisfaction, retention)
Develop evaluation frameworks for agent quality that can be reused as benchmarks—similar to how LLMs have standard evals
Investigate why agent performance varies across coding tasks, languages, or user segments—and recommend targeted improvements
Qualifications
Required
5+ years of experience in data science, analytics, or a quantitative role with a focus on product, growth, or experimentation
Deep experimentation expertise: A/B testing, experiment design, power analysis, handling skewed data, interpreting results beyond p-values
Strong SQL skills and experience designing data models for high-volume event data; experience with dbt or similar transformation tools
Proficiency in Python and data science libraries (pandas, scipy, statsmodels, etc.)
Ability to translate ambiguous questions into structured analysis and communicate findings clearly to both technical and non-technical stakeholders
Bias toward action: you ship insights that influence decisions, not just dashboards
Preferred
Experience with LLM or AI agent evaluation—understanding of prompt-response patterns, agent evaluation frameworks, or model quality measurement
Background in high-growth SaaS or PLG companies with large-scale event data
Experience with modern data stack (BigQuery, dbt, Fivetran, Segment, Hex)
Familiarity with experimentation platforms (LaunchDarkly, Statsig, Eppo, or similar)
Understanding of developer tools or software engineering workflows
You've built agent or LLM evaluation frameworks from scratch
Experience with causal inference methods (difference-in-differences, synthetic control, CUPED)
Familiarity with real-time data systems or operational analytics for monitoring agent performance
Experience working with trace data, logging systems, or observability tooling
Benefits
💰 Competitive Salary & Equity
💹 401(k) Program
⚕️ Health, Dental, Vision and Life Insurance
🩼 Short Term and Long Term Disability
🚼 Paid Parental, Medical, and Caregiver Leave
🚗 Commuter Benefits
📱 Monthly Wellness Stipend
🧑💻 Autonomous Work Environment
🖥 In-Office Set-Up Reimbursement
🏝 Flexible Time Off (FTO) + Holidays
🚀 Quarterly Team Gatherings
☕ In-Office Amenities
Company
Replit
Replit is the most secure agentic platform for production-ready apps.
H1B Sponsorship
Replit has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference (data from the US Department of Labor).
Total Sponsorships by Year
2025: 8
2024: 5
2023: 2
2022: 2
Funding
Current Stage: Growth Stage
Total Funding: $472.02M
Key Investors: Prysm Capital, Craft Ventures, Andreessen Horowitz
2025-07-30 · Series C · $250M
2023-11-06 · Series B · $20M
2023-04-25 · Series B · $97.4M
Company data provided by Crunchbase