Embedding VC · 4 hours ago
Member of Technical Staff - Efficient ML
Embedding VC is introducing Moonlake, an AI platform for creating world simulations. They are seeking a Member of Technical Staff to focus on training efficiency, GPU performance, inference optimization, and infrastructure reliability.
Artificial Intelligence (AI) · Impact Investing
Responsibilities
Dataloaders, fusion, activation rematerialization (gradient checkpointing)
FSDP, ZeRO, tensor and pipeline parallelism; NCCL tuning
Nsight profiling, Triton/CUDA kernels, fused ops
Flash-attention–style speedups, sequence packing, KV-cache tricks
Low-latency serving, continuous batching, speculative decoding
Quantization (GPTQ/AWQ), distillation, pruning
SLURM/K8s multi-node jobs, checkpoint hygiene
Determinism, env pinning, GPU failure handling
Qualifications
Required
Hands-on experience in each area listed under Responsibilities above.