Tech Lead / Manager, AI Evaluation Science jobs in United States

Diligent Robotics · 2 months ago

Tech Lead / Manager, AI Evaluation Science

Diligent Robotics envisions a future powered by robots that work seamlessly with human teams. The company is seeking a Tech Lead / Manager for AI Evaluation Science to lead the team responsible for measuring and validating the performance of its physical AI systems, ensuring the safety and reliability of its robots in real-world scenarios.

Artificial Intelligence (AI) · Human Computer Interaction · Machine Learning · Robotics · Software
H1B Sponsor Likely

Responsibilities

Lead the AI Evaluation Science team, owning evaluation strategy for robot perception, planning, control, and multimodal models
Define metrics and benchmarks for AI performance across safety, reliability, user experience, and robustness
Develop and maintain large-scale simulation environments to test robot behaviors under diverse real-world conditions (edge cases, adversarial scenarios, rare failures)
Design evaluation frameworks that cover offline experiments, simulation, and live deployments
Build scalable pipelines for test coverage, automated evaluation, and regression tracking
Oversee labeling and data curation pipelines to generate high-quality ground truth for training and validation
Drive interpretability and explainability in embodied AI models—ensuring failures are measurable, diagnosable, and improvable
Collaborate closely with AI/Robotics engineering teams to define product requirements, set acceptance thresholds, and close the loop between evaluation and development
Actively mentor engineers and scientists while contributing hands-on to code, experiments, and metrics design

Qualifications

AI/ML experience · Multimodal ML models · Evaluation pipelines · Leadership experience · Robotics evaluation · Simulation platforms · ML interpretability · Cross-functional alignment · Communication

Required

MS or PhD in Computer Science, Robotics, ML, EE, or related field along with 8+ years of AI/ML experience
Proven leadership experience: built and managed technical teams in AI, simulation, or robotics evaluation
Hands-on expertise building and evaluating large multimodal ML models (vision, language, action)
Strong background in defining and operationalizing metrics for AI/robotics systems (safety, robustness, reliability)
Demonstrated success in designing end-to-end evaluation pipelines: from data labeling and test definition to automated reporting and regression tracking
Experience in evaluation, benchmarking, or safety in robotics, AVs, or similar domains
Experience with simulation platforms for robotics or AVs
Technical depth in ML interpretability, error analysis, and data-driven model improvement
Ability to operate in a startup context: strategic, but hands-on in code and experimentation
Excellent communication and cross-functional alignment skills—able to articulate risks, metrics, and trade-offs to executives, engineers, and non-technical stakeholders

Company

Diligent Robotics

Diligent Robotics develops AI-powered robot assistants to collaborate with and adapt to humans in everyday environments.

H1B Sponsorship

Diligent Robotics has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (5)
2024 (3)
2023 (5)
2022 (3)

Funding

Current Stage
Growth Stage
Total Funding
$90.82M
Key Investors
Canaan Partners · Tiger Global Management · Cedars-Sinai Accelerator
2025-02-27 · Series Unknown · $10.5M
2023-09-21 · Series Unknown · $33.75M
2022-04-11 · Series B · $30M

Leadership Team

Andrea Thomaz
Founder, CEO
Vivian Chu
Co-Founder & Chief Innovation Officer
Company data provided by Crunchbase