Amazon · 5 months ago
Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference
Amazon is seeking a Senior Software Development Engineer to join its Annapurna Labs team at AWS. This role involves architecting and implementing features for the AWS Neuron SDK, focusing on optimizing machine learning workloads and collaborating with various teams to enhance performance on AWS's custom ML accelerators.
Artificial Intelligence (AI) · Delivery · E-Commerce · Foundational AI · Retail
Responsibilities
Help lead the effort to build distributed inference support for PyTorch in the Neuron SDK
Tune models to achieve the highest performance and maximize their efficiency on customer AWS Trainium and Inferentia silicon and servers
Strong software development skills in Python and system-level programming, together with ML knowledge, are critical to this role
Our engineers collaborate across compiler, runtime, framework, and hardware teams to optimize machine learning workloads for our global customer base
Working at the intersection of software, hardware, and machine learning systems, you'll bring expertise in low-level optimization, system architecture, and ML model acceleration
Work with state-of-the-art LLMs, open-source and internal LLM families, and large-scale performance and benchmark evaluations
Develop and performance-tune a wide variety of LLM families, including 500B+ large language models such as the Llama family, DeepSeek, and beyond
Work side by side with performance, compiler and runtime engineers to create, build and tune distributed inference solutions with Trainium and Inferentia
Build infrastructure to systematically analyze and onboard multiple models with diverse architectures
Collaborate with the performance team to enable and evaluate optimizations such as fusion, sharding, tiling, and scheduling
Conduct comprehensive testing, including unit and end-to-end model testing with continuous deployment and releases through pipelines
Work directly with customers to enable and optimize their ML models on AWS accelerators
Collaborate across teams to develop innovative optimization techniques
Build online/offline inference serving with vLLM, SGLang, TensorRT or similar platforms in production environments
Qualifications
Required
5+ years of non-internship professional software development experience
5+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Knowledge of the fundamentals of machine learning and LLMs, including their architecture and their training and inference lifecycles, along with work experience on optimizations that improve model execution
Experience programming with at least one software programming language
Preferred
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Master's degree in computer science or equivalent
Benefits
Equity
Sign-on payments
Full range of medical, financial, and/or other benefits
Company
Amazon
Amazon is a tech firm with a focus on e-commerce, cloud computing, digital streaming, and artificial intelligence.
H1B Sponsorship
Amazon has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (22803)
2024 (21175)
2023 (19057)
2022 (24088)
2021 (12233)
2020 (14881)
Funding
Current Stage
Public Company
Total Funding
$8.11B
Key Investors
Amazon, Kleiner Perkins
2023-01-03 · Post-IPO Debt · $8B
2001-07-24 · Post-IPO Equity · $100M
1997-05-15 · IPO
Company data provided by Crunchbase