Codecov · 1 day ago
Senior Software Engineer, AI Eval
Codecov is on a mission to help developers write better software faster. As a Senior Software Engineer on Sentry’s AI/ML team, you’ll be responsible for building the evaluation infrastructure that measures the accuracy, reliability, and real-world performance of AI systems.
Developer Tools · Enterprise Software · SaaS · Test and Measurement
Responsibilities
Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring
Qualifications
Required
5+ years of professional experience with a Bachelor's degree in computer science, machine learning, or a related field
Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
Comfort writing production-quality code (we use Python and TypeScript)
Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
Preferred
Experience evaluating LLMs, agentic systems, or AI-assisted developer tools
Benefits
Incentive compensation
Equity grants
Paid time off
Group health insurance coverage
Company
Codecov
Codecov is an online platform that provides hosted testing reports and statistics for its users.
Funding
Current Stage: Early Stage
Total Funding: unknown
2022-11-30: Acquired
2020-01-01: Seed
Company data provided by Crunchbase