Principal Engineer- AI/ML Forward Deployment Engineering jobs in United States
This job has closed.

AMD · 6 days ago

Principal Engineer- AI/ML Forward Deployment Engineering

AMD is a company dedicated to building innovative products for next-generation computing experiences. The Principal Engineer role focuses on optimizing the design and deployment of AI/ML infrastructures while collaborating with various teams and customers to ensure robust performance and reliability.

AI Infrastructure, Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Collaborate with strategic customers on scalable designs involving compute, networking, and storage environments; work with industry partners and internal teams to accelerate the deployment and adoption of AI/ML models
Engage in system-level triage and at-scale debugging of complex issues across hardware, firmware, and software, ensuring rapid resolution and system reliability
Drive the ramp of Instinct-based large-scale AI datacenter infrastructure based on NPI base platform hardware with ROCm, scaling up to pod and cluster level and leveraging the best in network architecture for AI/ML workloads
Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations
Engage with clients to deeply understand their technical needs, ensuring their satisfaction with tailored solutions that leverage your past experience in strategic customer engagements and architectural wins
Provide domain-specific knowledge to other groups at AMD and share lessons learned to drive continuous improvement
Engage with AMD product groups to drive resolution of application and customer issues
Develop and present training materials to internal audiences, at customer venues, and at industry conferences

Qualifications

AI/ML network deployments, Network architecture, Performance tuning, Python, Linux, Networking certifications, Cloud Technologies certifications, Communication skills, Team collaboration, Leadership

Required

Extensive experience in large network architecture, storage, AI/ML network deployments, and performance tuning
Self-motivated, with the ability to work well within a team environment
Collaborate with strategic customers on scalable designs involving compute, networking, and storage environments
Engage in system-level triage and at-scale debugging of complex issues across hardware, firmware, and software
Drive the ramp of Instinct-based large-scale AI datacenter infrastructure based on NPI base platform hardware with ROCm
Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations
Engage with clients to deeply understand their technical needs
Provide domain-specific knowledge to other groups at AMD
Engage with AMD product groups to drive resolution of application and customer issues
Develop and present training materials to internal audiences, at customer venues, and at industry conferences
Ability to work well in a geographically dispersed team
Bachelor's or master's degree in computer science, engineering, or a related subject
This is a senior-level role; no recent college graduates will be considered

Preferred

Expertise in networking and performance optimization for large-scale AI/ML networks
Solid, hands-on expertise in at least one of three domains: compute, network, or storage
Demonstrated leadership in network architecture, with hands-on experience in RoCEv2 design, VXLAN-EVPN, BGP, and lossless fabrics
Deep experience in working with large customers such as Cloud Service Providers and global enterprise customers
Proven leadership in engaging customers with diverse technical disciplines
Direct experience working with large customers and the ability to operate with a sense of urgency
Extensive experience in Python, Linux, kernel modules, and application libraries
Proven ability to influence design and technology roadmaps
Extensive hands-on network deployment expertise and a proven track record of delivering large projects on time
Cisco, Juniper, or Arista experience is required
Direct co-development/deployment experience working with strategic customers/partners
Excellent communication skills with audiences ranging from engineers to mid-level management to C-level executives
Certifications in Networking, AI/ML, or Cloud Technologies

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su
Chair & CEO
Mark Papermaster
CTO and EVP
Company data provided by Crunchbase