Advanced Microdevices Pvt. Ltd. (India) · 3 hours ago
Senior GPU Kubernetes Engineer
Advanced Micro Devices, Inc. is dedicated to building innovative products that enhance computing experiences across various domains. The Senior GPU Kubernetes Engineer will lead GPU Operator development and optimize AI workloads, ensuring effective integration and deployment automation for the AMD Enterprise AI Suite.
Biopharma · Biotechnology · Industrial Manufacturing
Responsibilities
Lead GPU Operator development; implement topology-aware scheduling policies; optimize NUMA placement, PCIe locality, and memory bandwidth; and ensure robust integration with AMD’s ROCm drivers and runtimes
Design autoscaling logic for GPU-heavy inference and fine-tuning workloads, build monitoring and telemetry instrumentation, strengthen workload reliability, and develop scalable Helm charts and automation workflows
Collaborate closely with ROCm, platform, performance, and model teams to ensure end-to-end integration quality; troubleshoot across GPU runtimes, Kubernetes layers, and AI frameworks; influence AMD’s Kubernetes roadmap; and support deployment models across customer, partner, and ecosystem environments
Qualifications
Required
Strong Kubernetes engineering expertise
Deep understanding of GPU resource management
Hands-on experience optimizing AI workloads in cloud and on-prem environments
Proven track record in problem-solving, collaboration, and technical execution
BS, MS, or PhD in Computer Science or a related field, or equivalent
Preferred
Strong hands-on experience with Kubernetes GPU workloads
Operator/CRD development
Scheduling plugins and resource managers
Proficiency with Helm, Kustomize, Prometheus, Grafana, Fluentd/Fluent Bit, and Argo CD
Deep understanding of NUMA, GPU topology, affinity/anti-affinity rules, and multi-GPU inference strategies
Familiarity with distributed inference frameworks such as vLLM, Triton, KServe, or Ray
Experience deploying LLM workloads
Knowledge of ROCm, AMD MI300/MI325 platforms, OpenShift, KubeVirt, or enterprise Kubernetes systems
Benefits
AMD benefits at a glance.
Company
Advanced Microdevices Pvt. Ltd. (India)
Advanced Microdevices (mdi) is a leader in innovative membrane technologies.
Funding
Current Stage
Late Stage
Leadership Team
Nalini Kant Gupta
Founder & Managing Director
Company data provided by Crunchbase