AI Cluster & Data Center Design Engineer jobs in United States

Advanced Microdevices Pvt. Ltd. (India) · 1 month ago

AI Cluster & Data Center Design Engineer

Advanced Micro Devices, Inc. is dedicated to transforming lives with technology and is seeking a highly skilled systems engineer to architect and design scalable AI/HPC clusters. This role involves evaluating and selecting components to optimize performance and reliability while collaborating with cross-functional teams to deliver cutting-edge infrastructure for AI and high-performance computing workloads.

Biopharma, Biotechnology, Industrial, Manufacturing

Responsibilities

Design scalable AI/HPC clusters including compute, storage, and networking, with a specific focus on power delivery
Evaluate and select CPUs, GPUs, accelerators, interconnects, and memory configurations for optimal cluster performance
Design leading-edge power delivery solutions for high-density AI/GPU deployments
Understand differences in power delivery and regulatory requirements across global regions, e.g. the U.S., EMEA, and Asia
Define power budgets, redundancy schemes, and fault tolerance mechanisms
Design network topologies to maximize overall cluster performance
Understand the network performance needs of different types of workloads
Understand advantages and performance trade-offs of network topologies for AI/HPC clusters
Design and optimize storage solutions to maximize AI/HPC cluster performance
Understand advantages and performance trade-offs of cluster storage solutions, e.g. Lustre and Ceph
Work across multiple organizations with subject matter experts from hardware, software, network, data center, and operations teams to deliver scalable, efficient, and reliable compute infrastructure

Qualifications

HPC systems engineering, AI infrastructure design, Power delivery solutions, GPU/CPU architectures, Networking InfiniBand, Networking Ethernet, AI/ML frameworks, Problem-solving skills, Communication skills, Documentation skills

Required

Experience in HPC, AI infrastructure, or data center systems engineering
Strong understanding of rack and data center power delivery
Knowledge of GPU/CPU architectures, PCIe, UALink, InfiniBand, and Ethernet networking
Familiarity with AI/ML frameworks and workload characteristics
Excellent problem-solving, communication, and documentation skills
Bachelor's or Master's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field

Preferred

Experience designing power delivery solutions for racks and data centers
Contributions to open-source HPC or AI infrastructure projects

Benefits

AMD benefits at a glance.

Company

Advanced Microdevices Pvt. Ltd. (India)

Advanced Microdevices (mdi) is a leader in innovative membrane technologies.

Funding

Current Stage
Late Stage

Leadership Team

Nalini Kant Gupta
Founder & Managing Director
Company data provided by Crunchbase