Staff AI Research Scientist - Evaluation, Handshake AI jobs in United States

Handshake · 2 weeks ago

Staff AI Research Scientist - Evaluation, Handshake AI

Handshake is the career network for the AI economy, connecting knowledge workers, educational institutions, and employers to power career discovery and hiring. The Staff AI Research Scientist will lead research on evaluating large language models, develop benchmarks, and collaborate with engineers to translate research into scalable solutions.

College Recruiting, Data Collection and Labeling, Employment, Human Resources, Recruiting
H1B Sponsor Likely

Responsibilities

Lead teams of researchers to produce original research in LLM evaluation methodologies, interpretability, and human-AI knowledge alignment
Develop novel frameworks and assessment techniques that reveal deep insights into model capabilities, limitations, and emergent behaviors
Collaborate with engineers to translate research breakthroughs into scalable benchmarks, evaluation systems, and standards
Pioneer new approaches to measuring reasoning, alignment, and trustworthiness in frontier AI systems
Author high-quality code to enable large-scale experimentation, reproducible evaluation, and knowledge assessment workflows
Publish in top-tier conferences and journals, establishing new directions in the science of AI evaluation
Work cross-functionally with leadership, engineers, and external partners to set industry standards for responsible AI evaluation and alignment

Qualifications

Machine Learning, LLM Research, Python, PyTorch, Evaluation Methodologies, Interpretability, Benchmark Development, Research Publication, Communication Skills, Team Leadership

Required

PhD or equivalent research experience in machine learning, computer science, cognitive science, or related fields with focus on AI evaluation, interpretability, or model understanding
6+ years of post-doctoral academic or industry experience in a research-first environment
Strong background in LLM research, evaluation methodologies, and/or foundational AI assessment techniques
Proven ability to independently design, lead, and execute evaluation research programs with novel data types end-to-end
Deep proficiency in Python and PyTorch for large-scale model analysis, benchmarking, and evaluation
Experience building or leading novel benchmark development, systematic model assessment, or interpretability studies
Strong publication record in post-training, evaluation, or interpretability that demonstrates field-defining contributions
Ability to clearly communicate complex insights and influence both technical and non-technical stakeholders

Preferred

Experience with RLHF, agent modeling, or AI alignment research
Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems
Understanding of challenges in scaling foundation models (training stability, safety, inference efficiency)
Contributions to open-source libraries or research tooling
Interest in the societal impact, deployment ethics, and governance of frontier AI systems

Benefits

Ownership: Equity in a fast-growing company
Financial Wellness: 401(k) match, competitive compensation, financial coaching
Family Support: Paid parental leave, fertility benefits, parental coaching
Wellbeing: Medical, dental, and vision coverage; mental health support; $500 wellness stipend
Growth: $2,000 learning stipend, ongoing development
Remote & Office: Internet, commuting, and free lunch/gym in our SF office
Time Off: Flexible PTO, 15 holidays + 2 flex days
Connection: Team outings & referral bonuses

Company

Handshake

Handshake is a college career network that helps students and recent graduates find their next opportunity.

H1B Sponsorship

Handshake has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (24)
2024 (14)
2023 (7)
2022 (15)
2021 (6)
2020 (4)

Funding

Current Stage: Late Stage
Total Funding: $434M
Key Investors: Notable Capital, EQT Ventures, Spark Capital
2022-01-19: Series F, $200M
2021-05-12: Series E, $80M
2020-10-20: Series D, $80M

Leadership Team

Garrett Lord: CEO, Co-Founder
Ben Christensen: Co-Founder
Company data provided by Crunchbase