Principal GPU Performance Engineer - Artificial Intelligence jobs in United States

AMD · 8 hours ago

Principal GPU Performance Engineer - Artificial Intelligence

AMD is a leading company in computing technologies, dedicated to building innovative products that enhance computing experiences. The Principal GPU Performance Engineer will optimize AI training and inference workloads while collaborating with various teams to improve the efficiency of AMD's GPU architectures.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
No H1B

Responsibilities

Profile and optimize large-scale AI training and inference workloads (transformers, multimodal, diffusion, recommender systems) across multi-node, multi-GPU clusters
Identify bottlenecks in compute, memory, interconnects, and communication libraries (NCCL/RCCL, MPI), and deliver optimizations to maximize scaling efficiency
Collaborate with compiler/runtime teams to improve kernel performance, scheduling, and memory utilization
Develop, maintain and recommend benchmarks representative of foundation model AI training and inference workloads
Provide performance insights to AMD Instinct GPU architecture teams, informing hardware/software co-design decisions for future architectures
Partner with framework teams (PyTorch, JAX, TensorFlow) to upstream performance improvements and enable better scaling APIs
Present findings to cross-functional teams and leadership, shaping both software and hardware roadmaps

Qualifications

GPU tuning, Optimization, Distributed training frameworks, Performance analysis/profiling, Python, C++, Large-scale AI infrastructure, Advanced Linux OS, Curiosity, Innovation, Collaboration skills

Required

Strong expertise in GPU tuning and optimization (CUDA, ROCm, or equivalent)
Understanding of GPU microarchitecture (execution units, memory hierarchy, interconnects, warp scheduling)
Hands-on experience with distributed training and inference frameworks and communication libraries (e.g., PyTorch DDP, DeepSpeed, Megatron-LM, NCCL/RCCL, MPI)
Advanced Linux OS, container (e.g., Docker), and GitHub skills
Proficiency in Python or C++ for performance-critical development
Familiarity with large-scale AI training and inference infrastructure (NVLink, InfiniBand, PCIe, cloud/HPC clusters)
Experience with benchmarking methodologies, performance analysis/profiling (e.g., Nsight), and performance monitoring tools
Experience scaling training to thousands of GPUs for foundation models is a plus
Strong track record of optimizing large-scale AI systems or HPC environments is desired
Master's or PhD degree in Computer Science or Computer Engineering

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su
Chair & CEO
Mark Papermaster
CTO and EVP
Company data provided by Crunchbase