Member of Technical Staff - Efficient ML jobs in United States

Embedding VC · 1 week ago

Member of Technical Staff - Efficient ML

Embedding VC is introducing Moonlake, an AI platform for creating world simulations. The team is seeking a Member of Technical Staff focused on efficient machine learning, responsible for improving training efficiency and GPU performance, optimizing inference, and keeping infrastructure reliable.

Artificial Intelligence (AI) · Impact Investing

Responsibilities

Dataloaders, fusion, activation remat, gradient checkpointing
FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning
Nsight profiling, Triton/CUDA kernels, fused ops
Flash-attention–style speedups, sequence packing, KV-cache tricks
Low-latency serving, continuous batching, speculative decoding
Quantization (GPTQ/AWQ), distillation, pruning
SLURM/K8s multi-node jobs, checkpoint hygiene
Determinism, env pinning, GPU failure handling

Qualifications

Dataloaders · Gradient checkpointing · Nsight profiling · CUDA kernels · Quantization · SLURM · K8s · Low-latency serving · Soft skills

Required

Experience with dataloaders, fusion, activation remat, gradient checkpointing
Knowledge of FSDP/ZeRO/tensor+pipeline parallel and NCCL tuning
Proficiency in Nsight profiling, Triton/CUDA kernels, and fused ops
Experience with Flash-attention–style speedups, sequence packing, and KV-cache tricks
Skills in low-latency serving, continuous batching, and speculative decoding
Familiarity with quantization techniques (GPTQ/AWQ), distillation, and pruning
Experience with SLURM/K8s multi-node jobs and checkpoint hygiene
Understanding of determinism, environment pinning, and GPU failure handling

Company

Embedding VC

Embedding invests in early-stage Generative AI startups.

Funding

Current Stage
Early Stage

Leadership Team

Roger Jie Luo
Founder & Managing Partner
Company data provided by Crunchbase