GPU Performance Software Development Engineer jobs in United States

Advanced Microdevices Pvt. Ltd. (India) · 11 hours ago

GPU Performance Software Development Engineer

Advanced Micro Devices, Inc. is dedicated to building innovative products that enhance computing experiences across various domains. The GPU Performance Software Development Engineer will be responsible for optimizing GPU kernels, ensuring peak performance, and bridging high-level representations to low-level code generation.

Biopharma, Biotechnology, Industrial Manufacturing

Responsibilities

Own kernel performance for Wave
Optimize critical kernels (GEMM, Attention, MoE, decoding) to be competitive with or exceed vendor libraries
Profile, analyze, and eliminate bottlenecks across memory, registers, instruction scheduling, and wave/warp execution
Low-level GPU optimization
Write and tune kernels using HIP / CUDA / inline assembly / intrinsics (e.g., MFMA / MMA)
Optimize LDS/shared memory usage, register allocation, instruction scheduling, occupancy, and wave/warp utilization
Reason about hardware details such as waves/warps, WGP/SM behavior, pipelines, cache hierarchies, and memory systems
Extend and optimize MLIR dialects and lowering pipelines relevant to GPU code generation
Bridge high-level representations (FX / Python DSL) to low-level MLIR and ISA-aware transformations
Implement compiler passes for tiling, vectorization, prefetching, pipelining, and layout transformations
Build mental and empirical performance models to guide kernel design
Use profiling tools (e.g., rocprof, Nsight, custom counters) and disassembly to validate hypotheses
Create internal benchmarks, microkernels, and performance regression tests
Lead kernel and compiler optimization for new GPU architectures
Adapt kernels and compiler strategies to evolving hardware capabilities

Qualifications

GPU performance expertise, C++ programming, GPU programming (HIP/CUDA), Compiler experience, Performance analysis, Low-level programming, AMD GPUs, NVIDIA GPUs, Background in linear algebra, PhD in Computer Science

Required

Deep GPU performance expertise
Proven experience optimizing GPU kernels at the instruction and memory-system level
Strong understanding of GPU execution models (waves/warps, occupancy, latency hiding)
Proficiency in C++ and GPU programming (HIP or CUDA)
Experience with GPU intrinsics, inline PTX / GCN assembly, or equivalent low-level code
Hands-on experience with compilers, preferably MLIR
Familiarity with compiler IRs, lowering pipelines, and performance-critical transformations
Ability to read disassembly, analyze performance counters, and reason from first principles
Track record of closing performance gaps against strong baselines
Master's degree in Computer Science or a related field

Preferred

Experience with AMD GPUs (ROCm, CDNA, MI-series) or NVIDIA GPUs (Ampere/Hopper/Blackwell)
Experience designing or maintaining a DSL, compiler backend, or GPU codegen pipeline
Background in linear algebra kernels, attention mechanisms, or ML workloads
Comfort working across Python frontends, MLIR, and backend codegen
PhD in Computer Science or a related field

Benefits

AMD benefits at a glance.

Company

Advanced Microdevices Pvt. Ltd. (India)

Advanced Microdevices (mdi) is a leader in innovative membrane technologies.

Funding

Current Stage
Late Stage

Leadership Team

Nalini Kant Gupta
Founder & Managing Director
Company data provided by Crunchbase