P-1 AI · 8 hours ago
AI Evals Technical Lead
P-1 AI is focused on building engineering AGI, with the goal of integrating AI into the industrial sector. The AI Evals Technical Lead will work with a team of experts to develop and validate evaluation tests that ensure the AI engineer, Archie, meets industry skill expectations.
Artificial Intelligence (AI) · Software · Web Development
Responsibilities
Implement the system for organizing, transforming, running, grading, and reporting on eval benchmarks
Design and execute the process by which we develop and QA our evals, incorporating contributions from our own engineering team, industrial partners, and subject-matter experts
Ensure that evals run effectively within our CI/CD system, continuously benchmarking our evolving AI platform and the experiments we’re performing around it
Create methods for detecting and testing for common quality challenges of AI, including hallucinations, undesirable stochasticity, and regressions
Be a technical leader in the consistent implementation and organization of automated tests across other areas of our technology stacks
Qualifications
Required
Experience in constructing comprehensive test suites for software and/or AI systems, including coordinating the contributions of others
Experience designing metrics to evaluate systems and visualize their performance, including differences across successive generations
Experience in developing, managing, and running evals against LLM-based systems is a strong plus
Good communication skills with a variety of stakeholders (AI researchers, domain experts, application developers)
Proficiency in Python, including complex modules, and in modern software development tools and practices (Git, CI/CD, etc.)
Ability to thrive in a fast-paced, dynamic startup environment
Company
P-1 AI
P-1 AI is a technology company focused on developing an artificial general engineering intelligence (AGEI).
Funding
Current Stage: Early Stage
Total Funding: $23M
Key Investors: Radical Ventures, Village Global
2025-04-28 · Seed · $23M
2024-07-30 · Pre Seed
Recent News
Mexico Business
2026-01-15
Company data provided by Crunchbase