Amazon · 5 months ago

Sr. ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

Cupertino, California, USA

Full-time

Onsite

Senior Level

$151K/yr - $262K/yr

5+ years exp

Amazon, through its Annapurna Labs team, is seeking a Sr. ML Kernel Performance Engineer for AWS Neuron to improve the performance of deep learning and GenAI workloads on custom machine learning accelerators. The role involves designing high-performance kernels, optimizing performance across hardware generations, and working with customers to enable and optimize their machine learning workloads on AWS accelerators.

Artificial Intelligence (AI) · Delivery · E-Commerce · Foundational AI · Retail

No H1B

U.S. Citizen Only

Responsibilities

Design and implement high-performance compute kernels for ML operations, leveraging the Neuron architecture and programming models

Analyze and optimize kernel-level performance across multiple generations of Neuron hardware

Conduct detailed performance analysis using profiling tools to identify and resolve bottlenecks

Implement compiler optimizations such as fusion, sharding, tiling, and scheduling

Work directly with customers to enable and optimize their ML models on AWS accelerators

Collaborate across teams to develop innovative kernel optimization techniques

Qualifications

ML kernel optimization · GPU programming · HPC architectures · Performance analysis · Software development lifecycle · Programming languages · Parallel programming · Mentorship · Team collaboration

Required

5+ years of non-internship professional software development experience

5+ years of programming experience in at least one programming language

5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems

5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations

Experience as a mentor, tech lead, or leader of an engineering team

Preferred

Bachelor's degree in computer science or equivalent

6+ years of full software development experience

Expertise in accelerator architectures for ML or HPC such as GPUs, CPUs, FPGAs, or custom architectures

Experience with GPU kernel optimization and GPGPU computing such as CUDA, NKI, Triton, OpenCL, SYCL, or ROCm

Demonstrated experience with NVIDIA PTX and/or AMD GPU ISA

Experience developing high performance libraries for HPC applications

Proficiency in low-level performance optimization for GPUs

Experience with LLVM/MLIR backend development for GPUs

Knowledge of ML frameworks (PyTorch, TensorFlow) and their GPU backends

Experience with parallel programming and optimization techniques

Understanding of GPU memory hierarchies and optimization strategies

Benefits

Flexibility in working hours

Full range of medical, financial, and/or other benefits

Company

Amazon

Glassdoor 3.7

Amazon is a tech firm with a focus on e-commerce, cloud computing, digital streaming, and artificial intelligence.

Founded in 1994

Seattle, Washington, USA

10001+ employees

https://amazon.com

Funding

Current Stage

Public Company

Total Funding

$8.11B

Key Investors

Amazon · Kleiner Perkins

2023-01-03 · Post-IPO Debt · $8B

2001-07-24 · Post-IPO Equity · $100M

1997-05-15 · IPO

Leadership Team

Douglas J. Herrington

CEO, Worldwide Amazon Stores

Werner Vogels

VP & CTO

Company data provided by Crunchbase