Research Intern (Summer 2026) jobs in United States

SID.ai · 3 months ago

Research Intern (Summer 2026)

SID.ai is a company focused on training AI to retrieve and reason over data sources, backed by prominent investors including Y Combinator. They are seeking a Research Intern to work on post-training reasoning in LLMs, design RL training environments, and conduct experiments with next-generation models.

Data Integration, Data Management, Data Storage, Database, Generative AI, Information Technology, Infrastructure, Internet, Natural Language Processing, Software

Responsibilities

Post-train reasoning into LLMs with GRPO and SFT
Design and iterate RL training environments for retrieval – unstructured, structured, web
Run small and large model experiments – yolo runs encouraged
Work on next-generation vision-first embedding models

Qualifications

Torch/PyTorch, Reinforcement learning (RL), RL pipelines for language models, Torchrun/accelerate/multi-node training, Technical writing

Required

Not afraid of formulas – a technical major is an indicator of this (but isn't the only one)
Thinks they can learn anything in 2 weeks, but isn't arrogant about it
Prefers .py to .tex
Familiar with RL pipelines for language models
Comfortable with torchrun/accelerate/multi-node training
Clever about getting the data needed – or synthetically generating it
Finds easy solutions to hard problems, but doesn't mind getting their hands dirty, i.e., jumping a layer down into PyTorch or CUDA
Familiar with 'You and Your Research.' Understands what it takes to do significant work
Must articulate ideas well! A big part of making successful models is telling people about them. This includes writing docs and technical reports at the minimum – and jumping on podcasts at the extreme

Benefits

Work on frontier methods that scale. No weird old-school AI.
Everyone on the team can code – this might change in the future of course.

Company

SID.ai

SID is an AI research lab based in San Francisco. We train models that can retrieve and reason over any data source.

Funding

Current Stage
Early Stage
Total Funding
$0.5M
Key Investors
Y Combinator
2023-05-08 · Pre Seed · $0.5M

Leadership Team

Lukas Ruflair
Chief Information Officer
Company data provided by Crunchbase