Principal Solutions Engineering – AI server/rack Infrastructure jobs in United States

AMD · 1 day ago

Principal Solutions Engineering – AI server/rack Infrastructure

AMD is a company focused on building innovative products that enhance next-generation computing experiences. They are seeking a Principal Member of Technical Staff to lead system design support and customer engagement for their AMD Instinct product line, driving engineering and technical solutions for AI and server infrastructure.

AI Infrastructure · Artificial Intelligence (AI) · Cloud Computing · Computer · Embedded Systems · GPU · Hardware · Semiconductor
Growth Opportunities
No H1B

Responsibilities

System Architecture & Design Support
Solution Optimization: Partner deeply with customers to architect and optimize Rack-Scale AI solution deployments using AMD Instinct GPUs
Design Reviews: Support design reviews of customer platform and rack designs; proactively flag areas for modification to improve quality, performance, and competitive advantage
Bring-Up, Debug & Validation
Documentation & Best Practices: Deliver comprehensive technical documentation, best practices, and reference architectures to streamline the adoption and deployment of AMD AI platforms
Hands-on Engineering: Drive hands-on rack, platform, and component-level debug and validation. This includes complex stress testing, issue reproductions, and deep-dive root cause analysis
Issue Resolution: Lead customer issue resolution efforts, gathering diagnostics, managing critical escalations, and driving long-term process improvements to ensure customer success
System Firmware Debug & Deployment: Lead debug efforts for system firmware (BIOS, BMC) during initial bring-up and large-scale deployment phases. Ensure seamless integration between hardware, firmware, and software stacks, and resolve interaction issues in customer environments
End-Customer Debug & Sustaining: Own the technical support interface for end customers and provide high-level engineering for deployed fleets
Cross-Functional Alignment: Represent debug progress, technical insights, and status with clarity and impact at the leadership level, ensuring alignment and accountability across cross-functional teams
Roadmap Influence: Provide regular, detailed technical feedback from the field to directly influence AMD’s software and hardware roadmaps
Future Architecture: Drive future product architecture decisions by leveraging unique insights gained from deep customer execution engagement
Mentorship: Build a culture of ownership, accountability, and technical excellence within the team, while actively mentoring senior engineers and emerging technical leaders

Qualifications

System Architecture · Hardware/Firmware Debug · Customer Engagement · Rack-Scale AI Solutions · System Firmware · Debugging Tools · Leadership · Mentorship · Problem-Solving

Required

Dynamic and experienced Principal Member of Technical Staff to own system design support, rack-level bring-up, and critical customer engagement for the AMD Instinct product line
Act as the technical bridge between AMD's internal system architects, platform development teams, and OEM partners
Influence the design and architecture of AI solutions
Lead hands-on debug and validation efforts at customer locations
Drive engineering, root cause analysis, and influence future roadmaps based on field execution
Bachelor's, Master's, or PhD in Electrical Engineering, Computer Engineering, or Computer Science

Preferred

Advanced experience in system architecture, hardware/firmware debug, and customer-facing engineering roles (HPC or AI/ML focus preferred)
Deep understanding of Server/Rack system architecture (x86, GPU, PCIe, Interconnects)
Strong proficiency in System Firmware (BIOS/UEFI, BMC/OpenBMC) debug, update flows, and deployment strategies
Experience with system bring-up and debugging tools (oscilloscopes, logic analyzers, ITP, JTAG)
Knowledge of power delivery, thermal management, and mechanical form factors in datacenter environments
Proven track record of leading technical teams through complex problem-solving scenarios and interacting with executive leadership
Ability to travel to customer, factory and company locations

Benefits

AMD benefits at a glance.

Company

Advanced Micro Devices is a semiconductor company that designs and develops graphics units, processors, and media solutions.

Funding

Current Stage: Public Company
Total Funding: unknown
Key Investors: OpenAI, Daniel Loeb
2025-10-06: Post-IPO Equity
2023-03-02: Post-IPO Equity
2021-06-29: Post-IPO Equity

Leadership Team

Lisa Su, Chair & CEO
Mark Papermaster, CTO and EVP
Company data provided by Crunchbase