
Periodic Labs · 3 months ago

Distributed Training Engineer

Periodic Labs is an AI and physical sciences lab focused on building state-of-the-art models for scientific discoveries. The role involves optimizing and developing large-scale distributed LLM training systems to support AI scientific research and collaborating with researchers on various workflows and experiments.

Artificial Intelligence (AI) · Foundational AI · Generative AI · Machine Learning
H1B Sponsor Likely

Responsibilities

Optimize, operate, and develop large-scale distributed LLM training systems
Work closely with researchers to bring up, debug, and maintain mid-training and reinforcement learning workflows
Build tools and directly support frontier-scale experiments to make Periodic Labs the world’s best AI + science lab for physicists, computational materials scientists, AI researchers, and engineers
Contribute to open-source large-scale LLM training frameworks

Qualifications

Distributed training frameworks · Large-scale LLM training · Training on clusters · Optimizing training throughput · Debugging workflows

Required

Training on clusters with ≥5,000 GPUs
5D parallel LLM training
Distributed training frameworks such as Megatron-LM, FSDP, DeepSpeed, TorchTitan
Optimizing training throughput for large-scale Mixture-of-Experts models

Company

Periodic Labs

Periodic Labs develops artificial intelligence systems that simulate and predict the properties of materials using machine learning.

H1B Sponsorship

Periodic Labs has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced; the highlighted field represents job fields similar to this job)
Trends of Total Sponsorships: 2025 (2)

Funding

Current Stage: Early Stage
Total Funding: $300M
2025-09-30 · Seed · $300M
Company data provided by Crunchbase