AMD · 19 hours ago
Lead HPC Cluster Network Architect
AMD is a company focused on building innovative products that enhance computing experiences across various domains. The Lead HPC Cluster Network Architect will design advanced network architectures for AI/ML training and inference systems, collaborating with internal and external partners to optimize GPU communication and ensure seamless operation of AI clusters.
AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Responsibilities
Design state-of-the-art cluster network architectures for large AI/ML training and inference systems that can be optimized for hyperscale deployments
Engage with AMD's customer base to align system and networking architecture
Standardize Ethernet network architectures and best practices for GPU-to-GPU communication for deep learning and AI workloads using InfiniBand and Ethernet technologies
Co-design new Ethernet technology with AMD partner companies to build the next generation of AI cluster networks
Pioneer system and container networking strategies to facilitate seamless operation and scaling of AI clusters
Develop scalable AI/ML training and inference communication network reference architectures for each generation of AMD AI/ML products
Serve as chief network engineer on projects supporting Partner OEM co-design of AI/ML clusters
Participate in the design phase of each AMD AI/ML GPU generation by developing cluster communication network architectures and requirements
Collaborate across AMD internal and external partner teams to improve communication performance for AMD AI/ML clusters
Qualifications
Required
Master's or PhD degree preferred in Mathematics, Statistics, Electrical Engineering, Computer Engineering, or a related computational field; equivalent experience also considered
Preferred
In-depth knowledge of and experience with network topologies such as Rail and Fat Tree, and technologies including InfiniBand, RDMA, RoCE, NVLink, and PCIe
Expertise in cluster management, network security, automation, and workload management, along with a solid understanding of the OSI network model and the TCP/IP suite
Extensive real-world experience designing for manageability and operation of hyperscale Ethernet networks
Expert verbal and written communication skills
Strong analytical and problem-solving skills and pronounced attention to detail
Must be a self-starter, and able to independently drive tasks to completion
Benefits
AMD benefits at a glance.
Company
AMD
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.
Funding
Current Stage: Public Company
Total Funding: unknown
Key Investors: OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity
Company data provided by Crunchbase