Baseten · 2 weeks ago
GPU Kernel Engineer
Baseten powers mission-critical inference for leading AI companies by providing infrastructure and developer tools. The company is seeking a GPU Kernel Engineer to write and optimize high-performance GPU kernels that improve the performance and efficiency of machine learning models.
AI Infrastructure · Artificial Intelligence (AI) · Developer Tools · Machine Learning · Software · Software Engineering
Responsibilities
Design and implement high-performance GPU kernels for key ML operations, including matrix multiplications, attention mechanisms, and mixture-of-experts routing
Write and optimize code using CUDA, PTX assembly, and architecture-specific techniques
Apply advanced performance optimization methods such as memory coalescing, warp-level programming, tensor core acceleration, and compute/memory overlap
Implement cutting-edge features like quantization (FP8/FP4), sparsity, and compute/communication overlap
Identify and resolve performance bottlenecks using tools like Nsight Systems, Nsight Compute, and Torch Profiler
Collaborate with research teams to productionize theoretical advancements
Contribute to internal and open-source GPU libraries
Present technical contributions at industry conferences (e.g., NVIDIA GTC, AWS re:Invent)
Qualifications
Required
1–5 years of experience in CUDA development
Strong understanding of GPU architecture and programming paradigms: memory hierarchy (global, shared, registers, L1/L2 cache); thread/block/grid organization; synchronization techniques and race condition mitigation
Proficiency in C++ and GPU performance profiling tools
Knowledge of: the CUDA C++ API; memory access patterns and bandwidth optimization; numerical precision and quantization strategies; modern GPU features (e.g., tensor cores, async operations)
Preferred
Experience with Transformer models and attention optimization (e.g., Flash Attention)
Familiarity with GPU kernel libraries: CUTLASS, Triton, Thrust, CUB
Background in GEMM tuning and distributed/multi-GPU compute
Contributions to open-source GPU projects
Research publications or conference presentations on GPU performance
Benefits
Competitive compensation, including meaningful equity
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities
Company
Baseten
Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes.
H1B Sponsorship
Baseten has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (6)
2024 (8)
2023 (1)
2020 (1)
Funding
Current Stage: Late Stage
Total Funding: $585M
Key Investors: Bond, Greylock
2026-01-20 · Series Unknown · $300M
2025-09-05 · Series D · $150M
2025-02-19 · Series C · $75M
Company data provided by Crunchbase