Lead HPC Cluster Network Architect jobs in United States

AMD · 19 hours ago

Lead HPC Cluster Network Architect

AMD is a company focused on building innovative products that enhance computing experiences across various domains. The Lead HPC Cluster Network Architect will design advanced network architectures for AI/ML training and inference systems, collaborating with internal and external partners to optimize GPU communication and ensure seamless operation of AI clusters.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
No H1B

Responsibilities

Design state-of-the-art cluster network architectures for large AI/ML training and inference systems that can be optimized for hyperscale capabilities
Engage with the AMD customer base while aligning system and networking architecture
Standardize Ethernet network architectures and best practices for GPU-to-GPU communication for deep learning and AI workloads using InfiniBand and Ethernet technologies
Co-design new Ethernet technology with AMD partner companies to build the next generation of AI cluster networks
Pioneer system and container networking strategies to facilitate seamless operation and scaling of AI clusters
Develop scalable AI/ML training and inference communication network reference architectures for each generation of AMD AI/ML products
Serve as chief network engineer on projects supporting Partner OEM co-design of AI/ML clusters
Participate in the design phase of each AMD AI/ML GPU generation by developing cluster communication network architectures and requirements
Collaborate across AMD internal and external partner teams to improve communication performance for AMD AI/ML clusters

Qualifications

HPC Cluster Network Design, InfiniBand, Ethernet Technologies, Cluster Management, Network Security, Automation, Workload Management, OSI Network Models, TCP/IP Suites, Analytical Skills, Problem-Solving Skills, Attention to Detail, Verbal Communication, Written Communication, Self-Starter

Required

Master's or PhD degree in Mathematics, Statistics, Electrical Engineering, Computer Engineering, or a related computational field preferred; equivalent experience also considered

Preferred

In-depth knowledge of and experience with network topologies such as Rail and Fat Tree, and technologies including InfiniBand, RDMA, RoCE, NVLink, and PCIe
Expertise in cluster management, network security, automation, and workload management, along with a solid understanding of OSI network models and TCP/IP suites
Extensive real-world experience designing hyperscale Ethernet networks for manageability and operation
Expert verbal and written communication skills
Strong analytical and problem-solving skills and pronounced attention to detail
Must be a self-starter able to independently drive tasks to completion

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su
Chair & CEO
Mark Papermaster
CTO and EVP
Company data provided by Crunchbase