GitHub · 1 day ago
Staff Applied Researcher, AI Quality
GitHub is the world’s leading platform for agentic software development, and it is seeking a Staff Applied Researcher with deep expertise in AI evaluation and engineering. The role focuses on designing evaluation systems for AI models, influencing product decisions, and collaborating with engineering teams to enhance GitHub Copilot and AI features.
Artificial Intelligence (AI) · Cloud Computing · Developer Tools · Internet · Project Management · SaaS · Software
Responsibilities
Design next‑generation evaluation frameworks for code generation, reasoning, safety, multimodal tasks, and agentic workflows
Develop scalable automatic metrics, LLM‑judge systems, reward models, and human‑in‑the‑loop evaluation pipelines
Establish high‑signal, repeatable methodologies that influence product decisions across GitHub AI
Build and optimize evaluation tooling, datasets, benchmarking systems, and experimentation pipelines
Create and onboard new benchmarks covering the hardest tasks for coding agents
Collaborate closely with engineering teams to productionize research, validate improvements, and accelerate model iteration cycles
Own end‑to‑end quality insights for the models behind GitHub Copilot and new AI features
Work closely with product development, engineering, and design teams to integrate advanced research findings into practical applications, ensuring alignment with product goals and user needs
Shape GitHub’s strategy for model quality, alignment, and evaluation
Mentor other researchers and engineers, helping elevate technical standards across the organization
Drive clarity in ambiguous problem spaces and champion fast, high‑quality execution
Qualifications
Required
Bachelor's degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 8+ years' experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR master's degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 6+ years' experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR doctorate in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 4+ years' experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
OR equivalent experience
3+ years of strong engineering experience in Python/TypeScript, including building production‑grade evaluation or data/ML pipelines at scale
Proven track record shipping research or evaluation systems in production environments
Strong cross‑functional communication and influence skills
Preferred
Experience with LLM judge systems, reward modeling, alignment, or safety evaluations
Background in code generation, developer tools, or AI‑assisted programming
Experience with large‑scale experimentation and online/offline evaluation strategies
Open‑source contributions or experience working with developer communities
Experience designing and leading complex research projects from ideation to implementation
Ability to define and articulate data-driven strategies that consider cross-functional impacts and align with organizational priorities, particularly in a software development platform context
Benefits
Annual bonus
Stock
Competitive pay
Generous learning and growth opportunities
Excellent benefits
Company
GitHub
GitHub is a software company that offers code hosting services that allow developers to build software for open-source and private projects. It is a subsidiary of Microsoft.
H1B Sponsorship
GitHub has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The sponsorship figures below are provided for reference. (Data from the US Department of Labor)
Trends of Total Sponsorships
2025 (26)
2024 (17)
2023 (14)
2022 (20)
2021 (20)
2020 (10)
Funding
Current Stage: Late Stage
Total Funding: $350M
Key Investors: Sequoia Capital, Andreessen Horowitz
2018-06-03 · Acquired
2015-07-29 · Series B · $250M
2015-06-19 · Secondary Market
Company data provided by Crunchbase