Luma AI · 2 weeks ago
Site Reliability Engineer | AI Supercomputing
Luma AI is building the engine for multimodal general intelligence and is seeking a Site Reliability Engineer to architect the physical and digital foundation of AGI. This role involves designing and deploying high-performance supercomputing clusters and optimizing low-level networking for efficient distributed training jobs.
Responsibilities
Design and deploy high-performance clusters combining thousands of GPUs, CPUs, and high-throughput networking to maximize training efficiency
Optimize low-level networking (InfiniBand, RDMA) to ensure seamless communication between accelerators, eliminating bottlenecks in distributed training jobs
Collaborate with hardware partners to push the boundaries of what is possible, debugging failures at the intersection of the kernel, driver, and silicon
Qualifications
Required
You possess elite knowledge of high-performance computing (HPC), including job schedulers and the nuances of GPU architecture
You are comfortable navigating the Linux terminal to solve complex performance issues, utilizing tools like perf and strace to optimize at the OS level
You have a history of building infrastructure from the ground up, demonstrating the ability to design systems where no playbook currently exists
Company
Luma AI
Luma AI develops tools that let users generate photorealistic images and videos from text, image, or video prompts.
H1B Sponsorship
Luma AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart; the highlighted segment represents the job field similar to this job)
Trends of Total Sponsorships
2025 (10)
2024 (3)
Funding
Current Stage: Growth Stage
Total Funding: $1.06B
Key Investors: HUMAIN, Andreessen Horowitz, Amplify Partners
2025-11-19 · Series C · $900M
2024-12-06 · Series B · $90M
2024-01-09 · Series B · $43M
Company data provided by Crunchbase