
Baseten · 4 hours ago

GPU Kernel Engineer

Baseten powers mission-critical inference for dynamic AI companies and is seeking a GPU Kernel Engineer to enhance the performance of machine learning models. The role focuses on designing high-performance GPU kernels and optimizing computation to support advanced AI applications.

Artificial Intelligence (AI), Developer Tools, Machine Learning, Software, Software Engineering
H1B Sponsor Likely

Responsibilities

Design and implement high-performance GPU kernels for key ML operations, including matrix multiplications, attention mechanisms, and mixture-of-experts routing
Write and optimize code using CUDA, PTX assembly, and architecture-specific techniques
Apply advanced performance optimization methods such as memory coalescing, warp-level programming, tensor core acceleration, and compute/memory overlap
Implement cutting-edge features like quantization (FP8/FP4), sparsity, and compute/communication overlap
Identify and resolve performance bottlenecks using tools like Nsight Systems, Nsight Compute, and Torch Profiler
Collaborate with research teams to productionize theoretical advancements
Contribute to internal and open-source GPU libraries
Present technical contributions at industry conferences (e.g., NVIDIA GTC, AWS re:Invent)

Qualifications

CUDA development, GPU architecture, C++, GPU performance profiling, Memory optimization, Collaboration, Technical presentations

Required

1–5 years of experience in CUDA development
Strong understanding of GPU architecture and programming paradigms: memory hierarchy (global, shared, registers, L1/L2 cache); thread/block/grid organization; synchronization techniques and race condition mitigation
Proficient in C++ and GPU performance profiling tools
Knowledge of: the CUDA C++ API; memory access patterns and bandwidth optimization; numerical precision and quantization strategies; modern GPU features (e.g., tensor cores, async operations)

Preferred

Experience with Transformer models and attention optimization (e.g., FlashAttention)
Familiarity with GPU kernel libraries: CUTLASS, Triton, Thrust, CUB
Background in GEMM tuning and distributed/multi-GPU compute
Contributions to open-source GPU projects
Research publications or conference presentations on GPU performance

Benefits

Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Company

Baseten

Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes.

H1B Sponsorship

Baseten has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (6)
2024 (8)
2023 (1)
2020 (1)

Funding

Current Stage: Late Stage
Total Funding: $285M
Key Investors: Bond, Greylock
2025-09-05: Series D, $150M
2025-02-19: Series C, $75M
2024-03-04: Series B, $40M

Leadership Team

Aaron Relph (Design)
Company data provided by Crunchbase