Diligent Robotics · 2 months ago
Tech Lead / Manager, AI Evaluation Science
Diligent Robotics envisions a future powered by robots that work seamlessly with human teams. The company is seeking a Tech Lead / Manager for AI Evaluation Science to lead the team responsible for measuring and validating the performance of physical AI systems and for ensuring the safety and reliability of its robots in real-world scenarios.
Artificial Intelligence (AI) · Human Computer Interaction · Machine Learning · Robotics · Software
Responsibilities
Lead the AI Evaluation Science team, owning evaluation strategy for robot perception, planning, control, and multimodal models
Define metrics and benchmarks for AI performance across safety, reliability, user experience, and robustness
Develop and maintain large-scale simulation environments to test robot behaviors under diverse real-world conditions (edge cases, adversarial scenarios, rare failures)
Design evaluation frameworks that cover offline experiments, simulation, and live deployments
Build scalable pipelines for test coverage, automated evaluation, and regression tracking
Oversee labeling and data curation pipelines to generate high-quality ground truth for training and validation
Drive interpretability and explainability in embodied AI models—ensuring failures are measurable, diagnosable, and improvable
Collaborate closely with AI/Robotics engineering teams to define product requirements, set acceptance thresholds, and close the loop between evaluation and development
Actively mentor engineers and scientists while contributing hands-on to code, experiments, and metrics design
Qualifications
Required
MS or PhD in Computer Science, Robotics, ML, EE, or related field along with 8+ years of AI/ML experience
Proven leadership experience: built and managed technical teams in AI, simulation, or robotics evaluation
Hands-on expertise building and evaluating large multimodal ML models (vision, language, action)
Strong background in defining and operationalizing metrics for AI/robotics systems (safety, robustness, reliability)
Demonstrated success in designing end-to-end evaluation pipelines: from data labeling and test definition to automated reporting and regression tracking
Experience in evaluation, benchmarking, or safety in robotics, AVs, or similar domains
Experience with simulation platforms for robotics or AVs
Technical depth in ML interpretability, error analysis, and data-driven model improvement
Ability to operate in a startup context: strategic, but hands-on in code and experimentation
Excellent communication and cross-functional alignment skills—able to articulate risks, metrics, and trade-offs to executives, engineers, and non-technical stakeholders
Company
Diligent Robotics
Diligent Robotics develops AI-powered robot assistants to collaborate with and adapt to humans in everyday environments.
H1B Sponsorship
Diligent Robotics has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below for
reference. (Data powered by US Department of Labor)
[Chart omitted: Distribution of Different Job Fields Receiving Sponsorship. Legend: represents job field similar to this job]
Trends of Total Sponsorships
2025 (5)
2024 (3)
2023 (5)
2022 (3)
Funding
Current Stage: Growth Stage
Total Funding: $90.82M
Key Investors: Canaan Partners, Tiger Global Management, Cedars-Sinai Accelerator
2025-02-27 · Series Unknown · $10.5M
2023-09-21 · Series Unknown · $33.75M
2022-04-11 · Series B · $30M
Recent News
Silicon Prairie News · 2025-10-30
MarketScreener · 2025-10-29
Company data provided by Crunchbase