Evaluation Scenario Writer – AI Agent Testing Specialist (Remote, Contract) jobs in United States
This job has closed.

OpenTrain AI · 1 week ago

Evaluation Scenario Writer – AI Agent Testing Specialist (Remote, Contract)

OpenTrain AI is seeking an Evaluation Scenario Writer to enhance the testing of LLM-based agents in realistic environments. The role involves designing structured evaluation scenarios, defining agent behavior, and collaborating with developers to refine evaluation frameworks.

Artificial Intelligence (AI) · Freelance · Marketplace · Online Portals
Hiring Manager
Akita Sanders

Responsibilities

Design realistic, structured evaluation scenarios that simulate human-performed tasks
Define gold-standard (“gold path”) agent behavior and acceptable variations
Annotate task steps, expected outputs, edge cases, and scoring logic
Review agent outputs and iterate on scenarios for clarity, coverage, and realism
Collaborate with developers and other contributors to test and refine evaluation frameworks
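
As a rough illustration of the responsibilities above, an evaluation scenario with a gold path, acceptable variations, edge cases, and scoring logic might be sketched in Python as follows. This is a minimal sketch and not OpenTrain AI's actual format: the field names, the example task, the `score_run` helper, and the 50/50 weighting are all hypothetical.

```python
import json

# Hypothetical scenario; every field name and value here is an illustrative assumption.
scenario = {
    "id": "refund-request-001",
    "task": "Process a customer refund request",
    # Gold path: the reference sequence of agent steps.
    "gold_path": ["look_up_order", "check_refund_policy", "issue_refund", "notify_customer"],
    # Alternative step orders that still count as correct.
    "acceptable_variations": [
        ["check_refund_policy", "look_up_order", "issue_refund", "notify_customer"],
    ],
    "expected_output": {"refund_issued": True},
    "edge_cases": ["order not found", "refund window expired"],
}


def score_run(scenario, agent_steps, agent_output):
    """Score a run in [0, 1]: 0.5 for an accepted step sequence, 0.5 for the expected output."""
    accepted_paths = [scenario["gold_path"], *scenario["acceptable_variations"]]
    path_score = 0.5 if agent_steps in accepted_paths else 0.0
    output_score = 0.5 if agent_output == scenario["expected_output"] else 0.0
    return path_score + output_score


# A scenario of this shape serializes directly to JSON.
serialized = json.dumps(scenario, indent=2)

gold_score = score_run(scenario, scenario["gold_path"], {"refund_issued": True})
partial_score = score_run(scenario, ["issue_refund"], {"refund_issued": True})
```

Keeping the scenario as plain data and the scoring as a separate function means the same scenario file could be reviewed by non-programmers and rescored as the logic is refined.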

Qualifications

Analytical thinking · QA reasoning · Software testing · Python · JavaScript · NLP annotation · Structured formats · Clear documentation

Required

Strong analytical thinking and QA-style reasoning
Excellent written English and clear documentation skills
Comfort working with structured formats like JSON/YAML
Basic Python and JavaScript experience

Preferred

Background in software testing, QA, data analysis, or NLP annotation (strongly preferred)

Company

OpenTrain AI

OpenTrain AI connects companies with vetted data labeling experts, supports any annotation tool, and manages escrow payments.

Funding

Current Stage
Early Stage

Leadership Team

Weston Dotson
Founder
Company data provided by Crunchbase