Evaluation Scenario Writer – AI Agent Testing Specialist (Remote, Contract)
OpenTrain AI is seeking an Evaluation Scenario Writer to enhance the testing of LLM-based agents in realistic environments. The role involves designing structured evaluation scenarios, defining agent behavior, and collaborating with developers to refine evaluation frameworks.
Responsibilities
Design realistic, structured evaluation scenarios that simulate human-performed tasks
Define gold-standard (“gold path”) agent behavior and acceptable variations
Annotate task steps, expected outputs, edge cases, and scoring logic
Review agent outputs and iterate on scenarios for clarity, coverage, and realism
Collaborate with developers and other contributors to test and refine evaluation frameworks
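To make the responsibilities above concrete, here is a minimal sketch of what a structured evaluation scenario with a gold path and simple scoring logic might look like. All field names, step names, and the scoring function are illustrative assumptions, not OpenTrain AI's actual schema or framework.

```python
# Hypothetical evaluation scenario: task steps, gold-standard path,
# acceptable variations, edge cases, and scoring logic in one structure.
scenario = {
    "task": "Book a one-way flight from BOS to SFO",
    "gold_path": [
        "open_search_form",
        "enter_origin:BOS",
        "enter_destination:SFO",
        "select_date",
        "submit_search",
        "select_cheapest_result",
    ],
    # Variations the scorer would tolerate (e.g., steps some UIs reorder).
    "acceptable_variations": {
        "order_insensitive": [["select_date", "submit_search"]],
    },
    "edge_cases": ["no flights available", "ambiguous airport code"],
}

def score(agent_steps, scenario):
    """Fraction of gold-path steps the agent performed, in order."""
    gold = scenario["gold_path"]
    matched = 0
    for step in agent_steps:
        if matched < len(gold) and step == gold[matched]:
            matched += 1
    return matched / len(gold)
```

In practice such scenarios are typically serialized as JSON or YAML (hence the structured-formats requirement below), and the scoring logic would live in the evaluation framework rather than alongside the data.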
Qualifications
Required
Strong analytical thinking and QA-style reasoning
Excellent written English and clear documentation skills
Comfort working with structured formats like JSON/YAML
Basic Python and JavaScript experience
Preferred
Background in software testing, QA, data analysis, or NLP annotation