AMD · 1 day ago

Senior Software Development Engineer - SGLang

Santa Clara, CA

Full-time

Onsite

Senior Level

$192K/yr - $288K/yr

AMD is a company dedicated to building innovative products that enhance next-generation computing experiences. The role involves optimizing and developing deep learning frameworks for AMD GPUs, focusing on enhancing GPU kernel performance and collaborating with internal teams and open-source communities to drive contributions to AMD’s AI software ecosystem.

AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Computer · Embedded Systems · GPU · Hardware · Semiconductor

Growth Opportunities

H1B Sponsor Likely

Responsibilities

Optimize Deep Learning Frameworks: Enhance performance of frameworks like TensorFlow, PyTorch, and SGLang on AMD GPUs via upstream contributions in open-source repositories

Develop and Optimize Deep Learning Models: Profile and tune large-scale training and inference models for optimal performance on AMD hardware

GPU Kernel Development: Design, implement, and optimize high-performance GPU kernels using HIP, Triton, or other relevant tools for AI operator efficiency

Collaborate with GPU Library and Compiler Teams: Work closely with internal compiler and GPU math library teams to integrate and align kernel-level optimizations with full-stack performance goals

Contribute to SGLang Development: Support optimization, feature development, and scaling of the SGLang LLM framework across AMD GPU platforms

Distributed System Optimization: Tune and scale performance across both multi-GPU (scale-up) and multi-node (scale-out) environments, including inference parallelism and collective communication strategies

Graph Compiler Integration: Integrate and optimize runtime execution through graph compilers such as XLA, TorchDynamo, or custom pipelines

Open-Source Collaboration: Partner with external maintainers to understand framework needs, propose optimizations, and upstream contributions effectively

Apply Engineering Best Practices: Leverage modern software engineering practices in debugging, profiling, test-driven development, and CI integration

Qualifications

C++ · Deep Learning Frameworks · GPU Kernel Development · SGLang · Distributed Systems · Python · Compiler Knowledge · GPU Computing · Software Engineering Best Practices · Problem-Solving · Collaboration

Required

Strong technical and analytical expertise in C++ development within Linux environments

Ability to thrive in both collaborative team settings and independent work

Ability to define goals, manage development efforts, and deliver high-quality solutions

Strong problem-solving skills

Proactive approach

Keen understanding of software engineering best practices

Bachelor's and/or Master's Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field

Preferred

Expert in C++ and/or Python, with demonstrated ability to debug, profile, and optimize performance-critical code

Solid hands-on experience with SGLang or similar LLM inference frameworks

Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm

Extensive experience running and scaling workloads on large-scale, heterogeneous clusters (CPU + GPU) using distributed training or inference strategies

Strong experience contributing to, or integrating optimizations into, deep learning frameworks such as PyTorch or TensorFlow

Strong knowledge of HIP, CUDA, or other GPU programming models; experience with GCN/CDNA architecture

Company

AMD

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

Founded in 1969

Santa Clara, California, USA

10001+ employees

http://www.amd.com

H1B Sponsorship

AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)

Trends of Total Sponsorships

2025 (836)

2024 (770)

2023 (551)

2022 (739)

2021 (519)

2020 (547)

Funding

Current Stage

Public Company

Total Funding

unknown

Key Investors

OpenAI

Daniel Loeb

2025-10-06 · Post-IPO Equity

2023-03-02 · Post-IPO Equity

2021-06-29 · Post-IPO Equity

Leadership Team

Lisa Su

Chair & CEO

Mark Papermaster

CTO and EVP

Recent News

Livemint.com

Physical AI dominates CES but humanity will still have to wait a while for humanoid servants

2026-01-09

GlobeNewswire

KunlunMeta Partners with AMD to Shine at CES

2026-01-09

The Register

AMD boasts 1000x higher AI perf by 2027 and pulls the lid off Helios compute tray ahead of 2H 2026 launch

2026-01-08

Company data provided by Crunchbase