Apply on Employer Site

DeepRec.ai · 1 week ago

AI Evaluation Engineer

United States

Full-time

Remote

Mid Level

$180K/yr - $180K/yr

DeepRec.ai is a mission-driven tech company focused on building AI-enabled products for governments and nonprofits. They are seeking an AI Evaluation Engineer to design and own evaluation systems that ensure AI features are safe and reliable before deployment.

Hiring Manager

Ben Reavill

Responsibilities

Own the evaluation stack – design frameworks that define “good,” “risky,” and “catastrophic” outputs

Automate at scale – build data pipelines, LLM judges, and integrate with CI to block unsafe releases

Stress testing – red team AI systems with challenge prompts to expose brittleness, bias, or jailbreaks

Track and monitor – establish model/prompt versioning, build observability, and create incident response playbooks

Empower others – deliver tooling, APIs, and dashboards that put eval into every engineer’s workflow

Qualification

OpenAI APILLM ecosystemsSoftware engineeringStatistical analysisData quality validationTypeScriptPrompting techniquesMonitoring toolsData science tooling