
AMD · 2 weeks ago

Distributed Inferencing Software Engineer - AI Models

AMD is a company focused on building innovative products that advance computing experiences across various domains including AI and data centers. The role involves working as a software engineer to enhance distributed inferencing on AMD GPUs while collaborating with a talented team to optimize performance for AI applications.

AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Computer · Embedded Systems · GPU · Hardware · Semiconductor
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Enable and benchmark AI models on distributed systems
Work in a distributed computing setting to optimize across scale-up (multi-GPU), scale-out (multi-node), and scale-across systems
Collaborate with internal GPU library teams to analyze and optimize distributed workloads for high throughput and low latency
Apply expertise in parallelization strategies for AI workloads to achieve the best performance for each configuration
Contribute to distributed model management, model zoos, monitoring, benchmarking, and documentation
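The scale-up/scale-out parallelization work above can be made concrete with a toy sizing calculation. This is an illustrative sketch with hypothetical numbers, not AMD tooling: it shows how tensor-parallel (TP) and pipeline-parallel (PP) degrees determine each GPU's share of a model's weights.

```python
# Illustrative sketch (hypothetical config, not AMD's internal tooling):
# tensor parallelism (TP) and pipeline parallelism (PP) both partition the
# model's weights, so each GPU holds 1/(tp * pp) of the parameters.
# Data parallelism replicates weights and does not shrink the per-GPU share.
def params_per_gpu(total_params: int, tp: int, pp: int) -> int:
    """Parameters held by each GPU when weights are split evenly
    across tp * pp devices."""
    return total_params // (tp * pp)

# Hypothetical 70B-parameter model on a single 8-GPU node (TP=8, PP=1):
total = 70_000_000_000
print(params_per_gpu(total, tp=8, pp=1))  # 8750000000 params per GPU
```

With 2 bytes per parameter (fp16/bf16), 8.75B parameters per GPU is roughly 17.5 GB of weights, which is the kind of arithmetic that drives the choice of TP/PP configuration per cluster topology.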

Qualifications

C++/Python AI development · GPU computing · AI framework engineering · Parallelization strategies · Performance analysis · Cluster orchestration software · Goal definition · Team collaboration · Independent work

Required

Strong technical and analytical skills in C++/Python AI development, solving performance issues and investigating scalability on multi-GPU, multi-node clusters
Ability to work as part of a team while also working independently, defining goals and scope, and leading your own development effort
Master's or PhD, or equivalent experience, in Computer Science, Computer Engineering, or a related field

Preferred

Knowledge of GPU computing (HIP, CUDA, OpenCL)
AI framework engineering experience (vLLM, SGLang, Llama.cpp)
Understanding of KV cache transfer mechanisms and options (Mooncake, NIXL/RIXL) and of Expert Parallelization (DeepEP/MORI/PPLX-Garden)
Excellent C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
Experience running workloads, especially AI models, on large-scale heterogeneous clusters
Familiarity with clusters and orchestration software (SLURM, K8s)
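The KV cache expertise listed above often starts with a back-of-envelope sizing: the cache holds a key and a value tensor per layer, per token, per KV head. Below is a minimal sketch using the standard formula; the model configuration numbers are hypothetical (Llama-2-7B-like), not tied to this role.

```python
# Back-of-envelope KV cache sizing (illustrative; config numbers hypothetical).
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Total KV cache size in bytes: a key tensor AND a value tensor
    (factor of 2) per layer, per token, per KV head; fp16/bf16 elements
    are 2 bytes each."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Llama-2-7B-like config: 32 layers, 32 KV heads, head_dim 128,
# one 4096-token sequence:
gib = kv_cache_bytes(32, 32, 128, seq_len=4096, batch=1) / 2**30
print(f"{gib:.1f} GiB")  # 2.0 GiB
```

At large batch sizes this cache, not the weights, dominates GPU memory, which is why transfer/offload options like those named above matter for distributed serving.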

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Below presents additional info for your reference. (Data Powered by US Department of Labor)
[Chart: distribution of job fields receiving sponsorship; the highlighted field is similar to this job]
Trends of Total Sponsorships
2025: 836
2024: 770
2023: 551
2022: 739
2021: 519
2020: 547

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su
Chair & CEO
Mark Papermaster
CTO and EVP
Company data provided by Crunchbase