AMD · 1 week ago
Senior Software Development Engineer - LLM Kernel & Inference Systems
AMD is a leading company focused on building innovative products that enhance computing experiences across various domains. The Senior Software Development Engineer will lead technical efforts in LLM inference and kernel optimization for AMD GPUs, ensuring high-performance serving and collaborating with internal and open-source teams to drive significant performance improvements.
AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Computer · Embedded Systems · GPU · Hardware · Semiconductor
Responsibilities
Optimize LLM Inference Frameworks: Drive performance improvements in LLM inference frameworks such as vLLM, SGLang, and PyTorch for AMD GPUs, contributing both internally and upstream
LLM-Aware Kernel Development: Design and optimize GPU kernels critical to LLM inference, including attention, GEMMs, KV cache operations, MoE components, and memory-bound kernels
Distributed LLM Inference at Scale: Design, implement, and tune multi-GPU and multi-node inference strategies, including TP / PP / EP hybrids, continuous batching, KV cache management, and disaggregated serving
Model-System Co-Design: Collaborate with model and framework teams to align LLM architectures with hardware-aware optimizations, improving real-world inference efficiency
Compiler & Runtime Optimization: Leverage compiler technologies (LLVM, ROCm, Triton, graph compilers) to improve kernel fusion, memory access patterns, and end-to-end inference pipelines
End-to-End Inference Pipeline Optimization: Optimize the full inference stack, from model execution graphs and runtimes to scheduling, batching, and deployment
Open-Source Leadership: Engage with open-source maintainers to upstream optimizations, influence roadmap direction, and ensure long-term sustainability of contributions
Engineering Excellence: Apply best practices in software engineering, including performance benchmarking, testing, debugging, and maintainability at scale
Qualifications
Required
Deep LLM domain knowledge
Strong understanding of end-to-end inference systems
Ability to reason about attention, KV cache, batching, parallelism strategies, and their mapping to GPU kernels and hardware characteristics
Ability to thrive in ambiguous problem spaces
Ability to independently define technical direction
Ability to consistently deliver measurable performance gains
Ability to balance strong execution with thoughtful upstream collaboration
Commitment to maintaining a high bar for software quality
Preferred
Deep understanding of Large Language Model inference, including attention mechanisms, KV cache behavior, batching strategies, and latency/throughput trade-offs
Hands-on experience with vLLM, SGLang, or similar inference systems
Proven experience optimizing GPU kernels for deep learning workloads
Experience designing and tuning large-scale inference systems across multiple GPUs and nodes
Track record of meaningful upstream contributions to ML, LLM, or systems-level open-source projects
Strong proficiency in Python and C++
Deep experience in performance analysis, profiling, and debugging complex systems
Experience running and optimizing large-scale workloads on heterogeneous GPU clusters
Solid foundation in compiler concepts and tooling
Master's or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field
Benefits
AMD benefits at a glance.
Company
AMD
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.
H1B Sponsorship
AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship. Legend: job fields similar to this job]
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)
Funding
Current Stage: Public Company
Total Funding: unknown
Key Investors: OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity
Company data provided by Crunchbase