AMD · 7 hours ago
Software Development Engineer, AI Platforms
AMD is a company focused on building innovative products that enhance computing experiences across various domains. They are seeking a Principal Software Development Engineer to join their AI system optimization team, responsible for enabling efficient Generative AI training and inference at scale.
Embedded Software, Artificial Intelligence (AI), Semiconductor, Cloud Computing, Electronics, Hardware, AI Infrastructure, Computer, Embedded Systems, GPU
Responsibilities
Propose and apply innovative techniques to support both training and inference, including new communication architectures and parallelism strategies for training on large clusters
Implement novel, efficient architectures for Generative AI models for training and inference, and showcase their benefits on AMD devices
Work with open-source frameworks and communities (e.g., PyTorch, SGLang, Hugging Face) to integrate AMD-optimized models and libraries and to publish training recipes
Collaborate with software and hardware teams to co-optimize end-to-end performance on current and future AMD solutions
Publish and promote your work within AMD and at external venues
Qualifications
Required
Deep technical understanding of image/video generation systems, LLM parallelism, and distributed inference frameworks
Hands-on experience with communication middleware, e.g., NCCL/RCCL, MPI, and RoCE v2
Experience training models at scale
Passion for developing efficient approaches to distributed training and inference at scale on AMD devices
PhD or master's degree with a major in Computer Science Engineering, Electrical Engineering, Electronics Engineering, Mathematics, or a related field
Preferred
Strong technical expertise in communication middleware (e.g., NCCL/RCCL and MPI) and familiarity with deep learning frameworks (e.g., PyTorch)
Strong technical expertise in benchmarking and performance optimization of distributed training and inference systems
Expertise or publications in one of the following areas: efficient model architectures, optimized training, innovative parallelism strategies, or communication frameworks
Experience with Slurm and Kubernetes for managing training and inference jobs on a cluster
Excellent written, verbal, and presentation skills, and the ability to coordinate internally and externally
Several years of experience in AI, deep learning and related software development
Benefits
AMD benefits at a glance.
Company
AMD
Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.
Funding
Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI
Daniel Loeb
2025-10-06 Post-IPO Equity
2023-03-02 Post-IPO Equity
2021-06-29 Post-IPO Equity
Recent News
2026-02-06
The Next Platform
Company data provided by Crunchbase