Together AI · 3 days ago
Research Intern, RL & Post-Training Systems, Turbo (Summer 2026)
Together AI is a research-driven artificial intelligence company focused on creating innovative AI systems. The role involves investigating post-training and reinforcement learning for large language models, emphasizing efficient and scalable methods while co-designing algorithms and systems.
AI Infrastructure · Artificial Intelligence (AI) · Generative AI · Internet · IT Infrastructure · Open Source
Responsibilities
Study RL and post-training methods whose performance and scalability are tightly coupled to inference behavior
Co-design algorithms and systems rather than treating them independently
Unlock new regimes of experimentation—larger models, longer rollouts, and more complex evaluations—by rethinking how inference, scheduling, and training interact
Design RL or preference-optimization objectives that explicitly account for inference cost and structure
Study how inference-time approximations affect learning dynamics in GRPO-, RLHF-, RLAIF-, or DPO-style methods
Analyze bias, variance, and stability trade-offs introduced by accelerated inference within RL loops
Develop inference mechanisms that support deterministic, reproducible RL rollouts at scale
Explore batching, scheduling, and memory-management strategies optimized for RL and evaluation workloads rather than pure serving
Investigate how KV-cache policies, sampling controls, or runtime abstractions influence learning efficiency
Empirically characterize how reward improvement and generalization scale with rollout cost, latency, and throughput
Quantify when systems-level optimizations change algorithmic behavior rather than only reducing runtime
Identify regimes where inference efficiency unlocks qualitatively new learning capabilities
Design rigorous benchmarks and diagnostics for post-training and RL efficiency
Study failure modes in long-horizon training and how system constraints shape outcomes
Qualifications
Required
Are pursuing a PhD or MS in Computer Science, EE, or a related field (exceptional undergraduates considered)
Have research experience in one or more of:
  RL or post-training for large models (e.g., RLHF, RLAIF, GRPO, preference optimization)
  ML systems (inference engines, runtimes, distributed systems)
  Large-scale empirical ML research or evaluation
Are comfortable with empirical research:
  Designing controlled experiments and ablations
  Interpreting noisy results and drawing principled conclusions
Can work across abstraction layers:
  Strong Python skills for experimentation
  Willingness to modify inference or training systems (experience with C++, CUDA, or similar is a plus)
Care about research insight, not just benchmarks:
  You ask why methods work or fail under real system constraints
  You think about how infrastructure assumptions shape algorithmic outcomes
Preferred
Prior research experience with foundation models or efficient machine learning
Publications at leading ML and NLP conferences (such as NeurIPS, ICML, ICLR, ACL, or EMNLP)
Understanding of model optimization techniques and hardware acceleration approaches
Contributions to open-source machine learning projects
Benefits
Housing stipends
Other competitive benefits
Company
Together AI
Together AI is a cloud platform for building open-source generative AI and the infrastructure for developing AI models.
H1B Sponsorship
Together AI has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (19)
2024 (6)
2023 (3)
Funding
Current Stage: Growth Stage
Total Funding: $533.5M
Key Investors: Salesforce Ventures, Lux Capital
2025-02-20 · Series B · $305M
2024-03-13 · Series A · $106M
2023-11-29 · Series A · $102.5M
Company data provided by Crunchbase