SID.ai · 3 months ago
Research Intern (Summer 2026)
SID.ai is a company focused on training AI to retrieve and reason over data sources, backed by prominent investors including Y Combinator. They are seeking a Research Intern to work on post-training reasoning in LLMs, design RL training environments, and conduct experiments with next-generation models.
Data Integration, Data Management, Data Storage, Database, Generative AI, Information Technology, Infrastructure, Internet, Natural Language Processing, Software
Responsibilities
Post-train reasoning into LLMs with GRPO and SFT
Design and iterate RL training environments for retrieval – unstructured, structured, web
Run small and large model experiments – YOLO runs encouraged
Work on next-generation vision-first embedding models
Qualifications
Required
Not afraid of formulas – a technical major is an indicator of this (but isn't the only one)
Thinks they can learn anything in 2 weeks, but isn't arrogant about it
Prefers .py to .tex
Familiar with RL pipelines for language models
Comfortable with torchrun/accelerate/multi-node training
Clever about getting the data needed – or synthetically generating it
Finds easy solutions to hard problems, but doesn't mind getting their hands dirty, i.e., jumping a layer down into PyTorch or CUDA
Familiar with Richard Hamming's 'You and Your Research' and understands what it takes to do significant work
Must articulate ideas well! A big part of making successful models is telling people about them. This includes writing docs and technical reports at a minimum – and jumping on podcasts at the extreme
Benefits
Work on frontier methods that scale. No weird old-school AI.
Everyone on the team can code – this might change in the future, of course.
Company
SID.ai
SID is an AI research lab based in San Francisco. We train models that can retrieve and reason over any data source.