Senior Software Engineer, AI Eval jobs in United States

Codecov · 13 hours ago

Senior Software Engineer, AI Eval

Codecov is on a mission to help developers write better software faster. As a Senior Software Engineer on Sentry’s AI/ML team, you’ll be responsible for building the evaluation infrastructure that measures the accuracy, reliability, and real-world performance of AI systems.

Developer Tools · Enterprise Software · SaaS · Test and Measurement

Responsibilities

Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
Partner closely with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

Qualifications

AI/ML experience · Python · TypeScript · Data infrastructure · Evaluation techniques · Cross-functional collaboration · Attention to detail

Required

5+ years of professional experience with a Bachelor's degree in computer science, machine learning, or a related field
Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
Comfort writing production-quality code (we use Python and TypeScript)
Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)

Preferred

Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools

Benefits

Incentive compensation
Equity grants
Paid time off
Group health insurance coverage

Company

Codecov

Codecov is an online platform that provides hosted testing reports and statistics for its users.

Funding

Current Stage: Early Stage
Total Funding: unknown
2022-11-30: Acquired
2020-01-01: Seed
Company data provided by Crunchbase