Senior Manager, AI/ML Forward Deployment Engineering jobs in United States
This job has closed.

AMD · 1 day ago

Senior Manager, AI/ML Forward Deployment Engineering

AMD is a company focused on building products that accelerate next-generation computing experiences. The Senior Manager, AI/ML Forward Deployment Engineering will lead the optimization of AI/ML fabric deployment and serve as a technical interface between customers and engineering teams, ensuring reliable and efficient infrastructure performance.

Artificial Intelligence (AI), Cloud Computing, Computer, Embedded Systems, GPU, Hardware, Semiconductor
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Collaborate with strategic customers on scalable designs involving compute, networking, and storage environments, and work with industry partners and internal teams to accelerate the deployment and adoption of various AI/ML models
Engage in system-level triage and at-scale debug of complex issues across hardware, firmware, and software, ensuring rapid resolution and system reliability
Drive the ramp of Instinct-based, large-scale AI datacenter infrastructure built on NPI base platform hardware with ROCm, scaling up to pod and cluster level and leveraging the best in network architecture for AI/ML workloads
Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations
Engage with clients to deeply understand their technical needs, ensuring their satisfaction with tailored solutions that leverage your past experience in strategic customer engagements and architectural wins
Provide domain-specific knowledge to other groups at AMD and share lessons learned to drive continuous improvement
Engage with AMD product groups to drive resolution of application and customer issues
Develop and present training materials to internal audiences, at customer venues, and at industry conferences

Qualifications

AI/ML network deployments, Network architecture, Performance tuning, Large-scale deployments, RoCEv2 Design, VXLAN-EVPN, BGP, Lossless Fabrics, Cisco, Juniper, Arista, Customer engagement, Training development, Geographically dispersed team, Certifications in Networking, Certifications in AI/ML, Certifications in Cloud Technologies, Communication skills, Team collaboration, Problem-solving

Required

Extensive experience in large network architecture, storage, AI/ML network deployments, and performance tuning
Disciplined approach to system triage, at-scale debug, and infrastructure optimization
Ability to work well within a team environment
Collaborate with strategic customers on scalable designs involving compute, networking, and storage environments
Engage in system-level triage and at-scale debug of complex issues across hardware, firmware, and software
Drive the ramp of Instinct-based, large-scale AI datacenter infrastructure built on NPI base platform hardware with ROCm
Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations
Engage with clients to deeply understand their technical needs
Provide domain-specific knowledge to other groups at AMD
Engage with AMD product groups to drive resolution of application and customer issues
Develop and present training materials to internal audiences, at customer venues, and at industry conferences
Bachelor's or master's degree in computer science, engineering, or a related subject
Experience in working with large customers such as Cloud Service Providers and global enterprise customers
Ability to work well in a geographically dispersed team

Preferred

Expertise in networking and performance optimization for large-scale AI/ML networks
Solid, hands-on expertise in at least one of three domains: compute, network, or storage
Proven leadership in engaging customers with diverse technical disciplines
Direct experience working with large customers and the ability to operate with a sense of urgency
Demonstrated leadership in network architecture, with hands-on experience in RoCEv2 design, VXLAN-EVPN, BGP, and lossless fabrics
Proven ability to influence design and technology roadmaps
Extensive hands-on network deployment expertise and a proven track record of delivering large projects on time
Direct co-development and deployment experience with strategic customers and partners
Excellent communication skills with audiences ranging from engineers to mid-level management to C-level executives
Certifications in Networking, AI/ML, or Cloud Technologies

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

H1B Sponsorship

AMD has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data provided by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (836)
2024 (770)
2023 (551)
2022 (739)
2021 (519)
2020 (547)

Funding

Current Stage
Public Company
Total Funding
unknown
Key Investors
OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su, Chair & CEO
Mark Papermaster, CTO and EVP
Company data provided by Crunchbase