DeepRec.ai · 2 weeks ago

LLM Evaluation Engineering Lead

Redwood City, CA

Full-time

Onsite

Lead/Staff

DeepRec.ai is a deep-tech AI company focused on building autonomous systems for complex environments. They are seeking an LLM Evaluation Engineering Lead to own the evaluation and verification processes for agentic LLM systems, ensuring that these systems improve and function reliably.

Hiring Manager

Ben Reavill

Responsibilities

Build eval harnesses for agentic LLM systems (offline + in-workflow)

Design evals for planning, execution, recovery, and safety

Implement verifier-driven scoring and regression gates

Turn eval failures into training signals (SFT / DPO / RL)

Qualifications

Evaluation systems for ML

Python

Data pipelines

Test harnesses

Distributed execution

Reproducibility

Agentic failure modes

Reasoning about measurements

Research experimentation

Production systems