AMD · 1 week ago
Principal GPU Performance Engineer - Artificial Intelligence
AMD builds products that accelerate next-generation computing experiences. The company is seeking a Principal GPU Performance Engineer to optimize AI training and inference workloads and guide the evolution of next-generation AMD Instinct GPU architectures.
AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Responsibilities
Profile and optimize large-scale AI training and inference workloads (transformers, multimodal, diffusion, recommender systems) across multi-node, multi-GPU clusters
Identify bottlenecks in compute, memory, interconnects, and communication libraries (NCCL/RCCL, MPI), and deliver optimizations to maximize scaling efficiency
Collaborate with compiler/runtime teams to improve kernel performance, scheduling, and memory utilization
Develop, maintain and recommend benchmarks representative of foundation model AI training and inference workloads
Provide performance insights to AMD Instinct GPU architecture teams, informing hardware/software co-design decisions for future architectures
Partner with framework teams (PyTorch, JAX, TensorFlow) to upstream performance improvements and enable better scaling APIs
Present findings to cross-functional teams and leadership, shaping both software and hardware roadmaps
Qualifications
Required
Master's or PhD degree in Computer Science or Computer Engineering
Preferred
Strong expertise in GPU tuning and optimization (CUDA, ROCm, or equivalent)
Understanding of GPU microarchitecture (execution units, memory hierarchy, interconnects, warp scheduling)
Hands-on experience with distributed training and inference frameworks and communication libraries (e.g., PyTorch DDP, DeepSpeed, Megatron-LM, NCCL/RCCL, MPI)
Advanced Linux OS, container (e.g. Docker) and GitHub skills
Proficiency in Python or C++ for performance-critical development
Familiarity with large-scale AI training and inference infrastructure (NVLink, InfiniBand, PCIe, cloud/HPC clusters)
Experience with benchmarking methodologies, performance analysis/profiling (e.g. Nsight), and performance monitoring tools
Experience scaling training to thousands of GPUs for foundation models is a plus
Strong track record of optimizing large-scale AI systems in cloud or HPC environments is desired
Benefits
AMD benefits at a glance.
Company
AMD
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.
H1B Sponsorship
AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)
Funding
Current Stage: Public Company
Total Funding: unknown
Key Investors: OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity
Recent News
Morningstar.com
2026-01-11
Company data provided by Crunchbase