Periodic Labs
Distributed Training Engineer
Periodic Labs is an AI and physical-sciences lab focused on building state-of-the-art models for scientific discovery. The role involves optimizing and developing large-scale distributed LLM training systems to support AI-driven scientific research, and collaborating with researchers on their workflows and experiments.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Machine Learning
Responsibilities
Optimize, operate, and develop large-scale distributed LLM training systems
Work closely with researchers to bring up, debug, and maintain mid-training and reinforcement learning workflows
Build tools and directly support frontier-scale experiments to make Periodic Labs the world’s best AI + science lab for physicists, computational materials scientists, AI researchers, and engineers
Contribute to open-source, large-scale LLM training frameworks
Qualifications
Required
Training on clusters with ≥5,000 GPUs
5D parallel LLM training
Distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, and TorchTitan (a minimal FSDP sketch follows this list)
Optimizing training throughput for large-scale Mixture-of-Experts models
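As a rough illustration of the framework experience listed above (not Periodic Labs' actual stack), the following minimal PyTorch sketch wraps a toy transformer layer in FSDP and runs one dummy training step. The model size, hyperparameters, and launch assumptions (torchrun exporting LOCAL_RANK) are all hypothetical.

    # Minimal FSDP sketch: shard a toy layer across GPUs and take one optimizer step.
    # Assumes launch via `torchrun --nproc_per_node=<num_gpus> train_fsdp.py`.
    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def main():
        dist.init_process_group(backend="nccl")          # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
        torch.cuda.set_device(local_rank)

        # Toy stand-in for an LLM block; a real run would wrap a full transformer.
        model = torch.nn.TransformerEncoderLayer(
            d_model=1024, nhead=16, batch_first=True
        ).cuda()

        # FULL_SHARD by default: parameters, gradients, and optimizer state are
        # sharded across ranks and gathered only when a layer needs them.
        model = FSDP(model)

        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

        # One dummy step on random data, just to show the loop shape.
        batch = torch.randn(8, 128, 1024, device="cuda")
        loss = model(batch).pow(2).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Frameworks like Megatron-LM, DeepSpeed, and TorchTitan layer tensor, pipeline, context, and expert parallelism on top of this kind of data-parallel sharding, which is roughly what "5D parallel" training refers to.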
Company
Periodic Labs
Periodic Labs develops artificial intelligence systems that simulate and predict the properties of materials using machine learning.
H1B Sponsorship
Periodic Labs has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Below presents additional info for your
reference. (Data Powered by US Department of Labor)
Charts omitted: Distribution of Different Job Fields Receiving Sponsorship (highlighting fields similar to this job) and Trends of Total Sponsorships (2025: 2).
Funding
Current Stage: Early Stage
Total Funding: $300M
Latest Round: Seed, $300M (2025-09-30)
Recent News
Business Insider (2025-12-26)
Inside HPC & AI News | High-Performance Computing & Artificial Intelligence (2025-12-20)
Company data provided by crunchbase