Handshake · 2 weeks ago

Staff AI Research Scientist - Evaluation, Handshake AI

New York, NY

Full-time

Onsite

Senior Level, Lead/Staff

6+ years exp

Handshake is the career network for the AI economy, connecting knowledge workers, educational institutions, and employers to power career discovery and hiring. The Staff AI Research Scientist will lead research on evaluating large language models, develop benchmarks, and collaborate with engineers to translate research into scalable solutions.

College Recruiting · Data Collection and Labeling · Employment · Human Resources · Recruiting

H1B Sponsor Likely

Responsibilities

Lead teams of researchers to produce original research in LLM evaluation methodologies, interpretability, and human-AI knowledge alignment

Develop novel frameworks and assessment techniques that reveal deep insights into model capabilities, limitations, and emergent behaviors

Collaborate with engineers to translate research breakthroughs into scalable benchmarks, evaluation systems, and standards

Pioneer new approaches to measuring reasoning, alignment, and trustworthiness in frontier AI systems

Author high-quality code to enable large-scale experimentation, reproducible evaluation, and knowledge assessment workflows

Publish in top-tier conferences and journals, establishing new directions in the science of AI evaluation

Work cross-functionally with leadership, engineers, and external partners to set industry standards for responsible AI evaluation and alignment

Qualifications

Machine Learning · LLM Research · Python · PyTorch · Evaluation Methodologies · Interpretability · Benchmark Development · Research Publication · Communication Skills · Team Leadership

Required

PhD or equivalent research experience in machine learning, computer science, cognitive science, or related fields with focus on AI evaluation, interpretability, or model understanding

6+ years of post-doctoral academic or industry experience in a research-first environment

Strong background in LLM research, evaluation methodologies, and/or foundational AI assessment techniques

Proven ability to independently design, lead, and execute evaluation research programs with novel data types end-to-end

Deep proficiency in Python and PyTorch for large-scale model analysis, benchmarking, and evaluation

Experience building or leading novel benchmark development, systematic model assessment, or interpretability studies

Strong publication record in post-training, evaluation, or interpretability that demonstrates field-defining contributions

Ability to clearly communicate complex insights and influence both technical and non-technical stakeholders

Preferred

Experience with RLHF, agent modeling, or AI alignment research

Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems

Understanding of challenges in scaling foundation models (training stability, safety, inference efficiency)

Contributions to open-source libraries or research tooling

Interest in the societal impact, deployment ethics, and governance of frontier AI systems

Benefits

Ownership: Equity in a fast-growing company

Financial Wellness: 401(k) match, competitive compensation, financial coaching

Family Support: Paid parental leave, fertility benefits, parental coaching

Wellbeing: Medical, dental, and vision; mental health support; $500 wellness stipend

Growth: $2,000 learning stipend, ongoing development

Remote & Office: Internet, commuting, and free lunch/gym in our SF office

Time Off: Flexible PTO, 15 holidays + 2 flex days

Connection: Team outings & referral bonuses

Company

Handshake

Glassdoor: 3.0

Handshake is a college career network that helps students and recent graduates find their next opportunity.

Founded in 2014

San Francisco, California, USA

501-1000 employees

http://joinhandshake.com

H1B Sponsorship

Handshake has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data from the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship (chart)

Trends of Total Sponsorships

2025 (24)

2024 (14)

2023 (7)

2022 (15)

2021 (6)

2020 (4)

Funding

Current Stage

Late Stage

Total Funding

$434M

Key Investors

Notable Capital, EQT Ventures, Spark Capital

2022-01-19 · Series F · $200M

2021-05-12 · Series E · $80M

2020-10-20 · Series D · $80M

Leadership Team

Garrett Lord

CEO - Co-Founder

Ben Christensen

Co-Founder

Recent News

lsvp.com

Lightspeed Venture Partners portfolio - Handshake (Will Kohler - Partner, James Ephrati - Partner)

2026-01-14

Business Insider

The CEO of $2 billion AI training startup says that humans will stay involved in data creation for decades

2026-01-05

Company data provided by Crunchbase