Salesforce · 1 day ago
Director, Agentforce Testing Center Engineering
Salesforce is the #1 AI CRM, where humans with agents drive customer success together. We are looking for a technical leader who will lead the team responsible for defining what 'good' looks like for agents, focusing on rigorous evaluations that connect agent specifications to business outcomes.
Agentic AI · Artificial Intelligence (AI) · Cloud Computing · CRM · SaaS · Sales Enablement · Software
Responsibilities
Build the "Evaluation Core": Lead the engineering of a scalable evaluation platform that runs in parallel with agent execution
Thread Science & Engineering: Operationalize applied science by turning theoretical benchmarks into production regression tests and establishing a discipline of eval-driven development
Thought Leadership: Act as the internal SME for AI testing. Educate cross-functional partners (Product, UX, ML) on the difference between stochastic AI behavior and traditional deterministic software
You are an engineering leader who can guide the group through technical leadership and process management, and who maintains a discipline of high-quality code delivery, aided by AI tools as necessary
You are a people leader who ensures teams have clear priorities and adequate resources. You are a multiplier with a passion for the success of the team and its members, providing technical guidance, career development, and mentoring
Qualifications
Required
Specialized Agent Evaluation Experience: You have specific experience building evaluation harnesses for LLMs or Agents
Applied Science & Engineering Hybrid: You have a track record of managing 'Research Engineering' or 'Applied Science' teams where you had to operationalize vague scientific goals into shipping code. You are comfortable curating 'Golden Sets' of data and building custom benchmarks from scratch
Deep Knowledge of Eval Methodologies: You are fluent in modern evaluation techniques, including:
LLM-as-a-Judge: Validating judges against human ground truth to prevent self-bias
Behavioral Analysis: Evaluating how an agent thinks (Reasoning Traces/Chain of Thought), not just the final output
Production-Grade AI Experience: You have shipped AI products where you had to manage real-world constraints like token budgets, inference latency, and cost-normalized accuracy. You take a pragmatic approach to building ML solutions that work in production at scale
Familiarity with academic and industry benchmarks and their limitations in a business environment
Experience building simulation environments (mock APIs, virtual users) to stress-test agents safely before deployment
Experience with data engineering, specifically data acquisition, data pipeline creation, metric measurement, and analysis
Experience owning highly available services and putting processes in place to maintain uptime
Prior experience working with global teams
Strong verbal and written communication, organizational, and time-management skills
Advanced degree in Computer Science, Machine Learning, or related field with a focus on system evaluation or reliability
Benefits
Time off programs
Medical
Dental
Vision
Mental health support
Paid parental leave
Life and disability insurance
401(k)
Employee stock purchasing program
Company
Salesforce
Salesforce is a cloud-based software company that provides customer relationship management software and applications.
H1B Sponsorship
Salesforce has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1883)
2024 (2296)
2023 (1850)
2022 (2849)
2021 (2124)
2020 (1960)
Funding
Current Stage: Public Company
Total Funding: $65.38M
Key Investors: Starboard Value, Emergence Capital, Halsey Minor
2022-10-18: Post-IPO Equity
2004-06-23: IPO
2003-01-01: Series Unknown · $1M
Company data provided by Crunchbase