Principal GPU Software Performance Engineer — Agentic Performance Optimization & Automation jobs in United States

Advanced Microdevices Pvt. Ltd. (India) · 3 weeks ago

Principal GPU Software Performance Engineer — Agentic Performance Optimization & Automation

Advanced Micro Devices, Inc. is focused on building products that accelerate next-generation computing experiences. The Principal GPU Software Performance Engineer will design and implement automated performance optimization loops for AI workloads on AMD GPUs, collaborating with compiler, framework, and infra teams to land and maintain performance improvements.

Biopharma · Biotechnology · Industrial · Manufacturing

Responsibilities

Build and maintain automation that continuously profiles and improves workload performance
Implement safe, data‑driven tuning of configurations and kernels to achieve measurable gains
Detect and triage performance regressions; ensure changes are validated in CI
Integrate with existing profiling, compiler, and build/test tooling
Produce clear reports and dashboards to communicate results to stakeholders
Create reusable tools and interfaces that teams can adopt with minimal effort
Partner across teams to prioritize, land, and maintain performance improvements

Qualifications

GPU performance engineering · Automation systems · Python · C++/Rust/Go · Deep learning optimization · CI/CD pipelines · ROCm/CUDA/HIP · Communication skills · Collaboration skills · Problem-solving skills

Required

Design and implement automated performance optimization loops
Collaborate with compiler, framework, and infra teams
Drive complex, multi-system issues to resolution
Combine rigorous profiling, data-driven experimentation, and safe automation loops
Communicate effectively and work well with teams across the organization
B.S./M.S./Ph.D. in Computer Science, Electrical/Computer Engineering, or related field, or equivalent industry experience

Preferred

Strong background in GPU performance engineering and systems
Experience profiling and optimizing deep learning workloads on modern accelerators
Familiarity with ROCm, CUDA/HIP, Triton, or similar low-level stacks
Experience building automation or optimization systems, such as auto-tuning frameworks (e.g., TVM, auto-scheduler, Triton autotune), experiment management or large-scale benchmarking infrastructure, and CI/CD pipelines focused on performance regression tracking
Familiarity with LLM/agent tooling, including orchestration frameworks, tool-calling, and structured logging for agents; applying LLMs to code generation, refactoring, or performance investigation is a plus
Strong software engineering skills in Python and at least one of C++/Rust/Go
Experience with distributed training/inference and scaling across multi-GPU, multi-node clusters
Comfort working cross-functionally with framework and compiler teams, infra/SRE and benchmarking teams, and customer-facing solution engineers

Benefits

AMD benefits at a glance.

Company

Advanced Microdevices Pvt. Ltd. (India)

Advanced Microdevices (mdi) is a leader in innovative membrane technologies.

Funding

Current Stage
Late Stage

Leadership Team

Nalini Kant Gupta
Founder & Managing Director
Company data provided by crunchbase