GPU Performance Engineer jobs in United States

Genmo · 6 months ago

GPU Performance Engineer

Genmo is an artificial intelligence creative content generation platform that specializes in developing creative products. They are seeking a GPU Performance Engineer to optimize their model serving stack and achieve significant performance improvements in video generation. The role involves using advanced profiling tools, writing custom CUDA kernels, and collaborating with ML engineers to enhance overall performance.

Artificial Intelligence (AI) · Content · Digital Entertainment

Responsibilities

Profile and optimize GPU workloads using Nsight Systems, nvprof, and custom instrumentation
Write high-performance CUDA and Triton kernels for critical model operations
Reduce cold start latency from seconds to milliseconds for our serving infrastructure
Tune memory access patterns, kernel fusion, and GPU utilization
Collaborate with ML engineers to optimize model implementations
Debug performance issues across the full stack from application to hardware
Implement custom memory pooling and allocation strategies
Share optimization techniques and build performance culture across teams

Qualifications

GPU optimization · CUDA programming · GPU profiling tools · Python · C++ · Triton kernel development · CUTLASS · ML-specific optimizations · RDMA/InfiniBand optimization · Low-level debugging · GPU architecture knowledge

Required

Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field
5+ years of systems programming experience, with 3+ years focused on GPU optimization
Expert proficiency with GPU profiling tools (Nsight Systems, nvprof)
Strong CUDA programming skills with production kernel development
Deep understanding of GPU architecture (memory hierarchy, SMs, warps)
Track record of achieving significant performance improvements (5-10x)
Experience with Python and C++ in production environments

Preferred

Experience with Triton kernel development
Knowledge of CUTLASS or similar high-performance libraries
Background in ML-specific optimizations (attention, transformers)
RDMA/InfiniBand optimization experience
Contributions to GPU libraries or frameworks
Low-level debugging skills (PTX/SASS reading)

Company

Genmo

Genmo is an artificial intelligence creative content generation platform that specializes in developing creative products.

Funding

Current Stage
Early Stage
Total Funding
$58.4M
Key Investors
New Enterprise Associates
2024-10-22 · Series A · $28.4M
2024-02-27 · Series Unknown · $30M

Leadership Team

Ajay Jain
Co-Founder and CTO
Company data provided by Crunchbase