Network System Design Engineer - Data Center GPU jobs in United States

AMD · 5 hours ago

Network System Design Engineer - Data Center GPU

AMD is a leading company focused on building innovative products that enhance computing experiences across various domains. They are seeking a senior engineer to join their Datacenter Graphics and Accelerated Computing team, responsible for debugging complex network systems and driving quality initiatives in datacenter environments.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
No H1B

Responsibilities

A powerful desire to learn new skills and understand new features as they are added
Proven track record of working within and across groups
Effective communication skills
Explore opportunities to improve the product
Work closely with other team members to understand design architecture and to propose solutions to improve and enhance products
Serve as debug/triage engineer for a new quality initiative
Understanding of GPU/System level HW and SW flow
Provide leadership in driving issues and bugs to root cause
Communicate and document debug flows and methods
Write embedded code for hardware components and the corresponding drivers for network components
Assist with network prototypes and in-depth testing to validate the design
Formulate and define platform level validation test plans based on product/customer needs
Troubleshoot and resolve platform network issues
Provide customer support regarding network architectural questions, product prerequisites, and product features
Interface with networking partners and software/hardware engineers
Work with software developers on network performance enhancement

Qualifications

System/SOC level debug, Network technologies, GPU/System level flow, Embedded coding, Linux Operating System, Networking standards, Mesh network protocols, HPC/ML/DL workloads, Effective communication, Problem-solving, Attention to detail, Critical thinking

Required

Bachelor's or Master's in Electrical Engineering, Computer Engineering, Computer Science, or a closely related field

Preferred

Exposure to systems architecture
Minimum 10 years of experience in System or SOC level debug and triage
Proven ability to drive resolution of critical problems within a lab or datacenter
Experience working with external customers/partners to help resolve problems in their data centers
Experience working with external customers/partners on manufacturing issues and failures
Experience working with external customers/partners to define requirements for manufacturing validation
8+ years of working experience with network technologies, including network selection and deployment in datacenter environments
Experience with modern networking standards
Experience with mesh network routing protocols and switching protocols
Familiar with Ethernet and InfiniBand network designs and switch topologies
Linux Operating System as a development environment
Familiar with Ethernet and InfiniBand networking in Linux and Windows environments
Familiar with virtualization environments such as KVM and Hyper-V
RDMA network configuration and troubleshooting
Linux kernel networking expertise
System/Platform level debug tools
Familiar with networking environments that run HPC/ML/DL workloads
Hands-on experience with lab equipment such as oscilloscopes, protocol analyzers, power supplies, and multimeters
Familiar with platform/system bring-up and validation of GPU networks, both intranode and internode (network adapters, cables, switches)
Significant experience in SoC and/or System debug of complex network issues
Develop / Document debug capabilities on a given SOC and System
Go-to person for debugging issues in production-level platform validation
Collaborate with internal teams to root-cause issues and find optimal resolutions

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

Funding

Current Stage: Public Company
Total Funding: unknown
Key Investors: OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su, Chair & CEO
Mark Papermaster, CTO and EVP
Company data provided by crunchbase