Baseten · 6 months ago

GPU Kernel Engineer

Baseten provides the infrastructure, tooling, and expertise needed to bring great AI products to market - fast. They are seeking a GPU Kernel Engineer to optimize GPU performance for machine learning models, contributing to high-impact systems work and collaborating with research teams.

AI Infrastructure · Artificial Intelligence (AI) · Developer Tools · Machine Learning · Software · Software Engineering
H1B Sponsor Likely

Responsibilities

Design and implement high-performance GPU kernels for key ML operations, including matrix multiplications, attention mechanisms, and mixture-of-experts routing
Write and optimize code using CUDA, PTX assembly, and architecture-specific techniques
Apply advanced performance optimization methods such as memory coalescing, warp-level programming, tensor core acceleration, and compute/memory overlap
Implement cutting-edge features like quantization (FP8/FP4), sparsity, and compute/communication overlap
Identify and resolve performance bottlenecks using tools like Nsight Systems, Nsight Compute, and Torch Profiler
Collaborate with research teams to productionize theoretical advancements
Contribute to internal and open-source GPU libraries
Present technical contributions at industry conferences (e.g., NVIDIA GTC, AWS re:Invent)

Qualifications

CUDA development · GPU architecture · C++ · GPU performance profiling · Memory optimization · Collaboration · Technical presentations

Required

1–5 years of experience in CUDA development
Strong understanding of GPU architecture and programming paradigms: memory hierarchy (global memory, shared memory, registers, L1/L2 cache); thread/block/grid organization; synchronization techniques and race condition mitigation
Proficient in C++ and GPU performance profiling tools
Knowledge of the CUDA C++ API, memory access patterns and bandwidth optimization, numerical precision and quantization strategies, and modern GPU features (e.g., tensor cores, async operations)

Preferred

Experience with Transformer models and attention optimization (e.g., FlashAttention)
Familiarity with GPU kernel libraries: CUTLASS, Triton, Thrust, CUB
Background in GEMM tuning and distributed/multi-GPU compute
Contributions to open-source GPU projects
Research publications or conference presentations on GPU performance

Benefits

Flexible PTO
401k
Covered healthcare premiums

Company

Baseten

Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes.

H1B Sponsorship

Baseten has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (6)
2024 (8)
2023 (1)
2020 (1)

Funding

Current Stage: Late Stage
Total Funding: $285M
Key Investors: Bond, Greylock
2025-09-05 · Series D · $150M
2025-02-19 · Series C · $75M
2024-03-04 · Series B · $40M

Leadership Team

Aaron Relph · Design
Company data provided by Crunchbase