Applied AI Researcher, Benchmarking jobs in United States

Distyl · 1 week ago

Applied AI Researcher, Benchmarking

Distyl develops AI-native technologies for collaboration between humans and AI, significantly impacting operations in large enterprises. The company is seeking an Applied AI Researcher to join its Benchmarking team, responsible for designing evaluation frameworks and benchmarks to measure the performance of intelligent systems.

Artificial Intelligence (AI), Generative AI, Information Technology, Software
H1B Sponsor Likely

Responsibilities

The Benchmarking team defines how progress is measured. Researchers design evaluation frameworks that capture reasoning depth, interaction quality, reliability, and operational impact. They construct benchmarks that reflect real-world complexity. Their systems become the standard by which new architectures, techniques, and releases are judged.
Researchers in Benchmarking explore new paradigms for evaluating intelligent systems: adversarial robustness testing, longitudinal performance tracking, and human-in-the-loop assessment. They investigate how metrics shape model behavior and establish rigorous methodologies for quantifying emergent capability. Their insights drive both Distyl’s internal research priorities and industry-wide standards.

Qualifications

Benchmark Design, Statistical Analysis, AI Systems Development, Research Publication, Programming Skills, Data Analysis, AI Tool Utilization, Creative Problem Solving, Collaboration

Required

Experience Designing and Running Evaluations: You've built or maintained benchmarks, test suites, or experimental frameworks to measure model or system performance.
Statistical and Analytical Rigor: You design fair, reproducible experiments and can extract signal from noisy empirical results.
Experience Building with Models, Not Just Building Models: We develop intelligent systems using models rather than training or fine-tuning them. Ideal candidates have expertise in compound AI systems, agentic collaboration, and associated techniques (ensembling, ReAct, graph-of-thoughts, etc.)
Proven Track Record of Research Results: Whether you've published in top journals, posted amazing work on Twitter, or shared it somewhere else, we want to see what you've done.
Uses AI Every Day: Before you can revolutionize someone else's workflow, you need to revolutionize yours. You should be using tools like ChatGPT, Cursor, and Perplexity to accelerate your workflow.
Strong Programming and Data Analysis Skills: While you might not consider yourself a software engineer, you need to be able to build prototypes of your ideas and then run the experiments that prove their effectiveness to a Fortune 500 Head of AI.
Bias Toward Showing vs. Telling: Our customers want to see the power of AI today rather than discuss the most elegant idea that will take 5 years to realize.

Benefits

Equity options
Medical/dental/vision covered at 100% for you and your dependents
401K plan
Commuter benefits
Lunch provided in office

Company

Distyl

Distyl AI partners with blue-chip leaders to help them create the enterprises of the future.

H1B Sponsorship

Distyl has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced)
Trends of Total Sponsorships: 2025 (7)

Funding

Current Stage: Growth Stage
Total Funding: $202M
Key Investors: Lightspeed Venture Partners
2025-09-22: Series B, $175M
2024-09-24: Series A, $20M
2023-04-13: Seed, $7M

Leadership Team

Arjun Prakash: Co-Founder, CEO
Derek Ho: Co-Founder, COO
Company data provided by Crunchbase