Freelance Agent Evaluation Analyst jobs in United States

Toloka · 2 months ago

Freelance Agent Evaluation Analyst

Toloka AI creates the data that powers leading GenAI models and innovations. The company is seeking a Freelance Agent Evaluation Analyst to oversee quality and insight across projects. The role calls for critical thinking and systems-level analysis in collaboration with a range of stakeholders.

Analytics · Artificial Intelligence (AI) · Crowdsourcing · Data Collection and Labeling · Machine Learning · Software

Responsibilities

Fully own the QA pipeline for agent evaluation tasks
Review and validate tasks and golden paths created by scenario writers and experts
Spot logical inconsistencies, vague requirements, hidden risks, and unrealistic assumptions
Provide structured feedback and ensure quality alignment across contributors
Train, onboard, and mentor new QA team members
Collaborate with domain experts, delivery managers, and engineers to improve test clarity and coverage
Maintain and improve QA checklists, SOPs, and review guidelines
Contribute to test planning, prioritization, and quality benchmarks
Take initiative to suggest new approaches, tools, and processes that help scale validation and analysis

Qualifications

Analytical skills · Manual QA experience · Critical thinking · JSON/YAML proficiency · Clear written communication · Constructive feedback · Coaching skills · Stakeholder collaboration · Proactive attitude · Attention to detail

Required

Strong analytical and critical thinking skills
Attention to detail and reliability - your work can be trusted without double-checking
Experience in manual QA, scenario validation, or similar analytical work
Comfortable working with structured formats (JSON/YAML)
Clear written communication and documentation skills
Ability to give constructive feedback and coach others
Able to work with a wide range of stakeholders, from engineers to directors and VPs

Preferred

Background in scenario-based testing, test design, or annotation workflows
Experience with AI/LLM evaluation, prompt validation, or agent behavior testing
Some technical independence (e.g., Python skills)
Familiarity with MCP / tool-based task execution
Experience working in cross-functional teams across product, delivery, and engineering

Benefits

Flexible payment based on the results of work
Flexibility: we offer freelance employment, and you will work with your manager to design a workday that suits you best
Hourly rate: 20-60 EUR
Friendly community

Company

Toloka

Toloka offers a data-centric environment that supports fast and scalable AI development across the ML lifecycle.

Funding

Current Stage
Growth Stage
Total Funding
$72M
Key Investors
Bezos Expeditions
2025-05-06 · Series Unknown · $72M

Leadership Team

Olga Megorskaya
CEO of Toloka.ai
Company data provided by Crunchbase