Amazon · 5 months ago
AGI Sr Inference Software Development Engineering, AGI Inference
Amazon is a leading technology company known for its innovative solutions in various domains. They are seeking a Senior Inference Software Development Engineer to join the Sensory Inference team, where the successful candidate will develop high-performance inference software and collaborate with research scientists to optimize neural models for diverse applications.
Artificial Intelligence (AI) · Delivery · E-Commerce · Foundational AI · Retail
Responsibilities
Develop high-performance inference software for a diverse set of neural models, typically in C/C++
Design, prototype, and evaluate new inference engines and optimization techniques
Participate in deep-dive analysis and profiling of production code
Optimize inference performance across various platforms (on-device, cloud-based CPU, GPU, proprietary ASICs)
Collaborate closely with research scientists to bring next-generation neural models to life
Partner with internal and external hardware teams to maximize platform utilization
Work in an Agile environment to deliver high-quality software against tight schedules
Hold a high bar for technical excellence within the team and across the organization
Qualifications
Required
5+ years of non-internship professional software development experience
5+ years of programming experience with at least one programming language
5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Experience as a mentor, tech lead, or leader of an engineering team
Bachelor's degree in Computer Science, Computer Engineering, or related field
Strong C/C++ programming skills
Solid understanding of deep learning architectures (CNNs, RNNs, Transformers, etc.)
Preferred
5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
Experience with inference frameworks such as PyTorch, TensorFlow, ONNX Runtime, TensorRT, llama.cpp, etc.
Proficiency in performance optimization for CPU, GPU, or AI hardware
Proficiency in kernel programming for accelerated hardware using programming models such as (but not limited to) CUDA, OpenMP, OpenCL, Vulkan, and Metal
Experience with latency-sensitive optimizations and real-time inference
Understanding of resource constraints on mobile/edge hardware
Knowledge of model compression techniques (quantization, pruning, distillation, etc.)
Experience with LLM efficiency techniques such as speculative decoding and long-context inference
Strong communication skills and ability to work in a collaborative environment
Passion for solving complex problems and driving innovation in AI technology
Benefits
A full range of medical, financial, and/or other benefits
Company
Amazon
Amazon is a tech firm with a focus on e-commerce, cloud computing, digital streaming, and artificial intelligence.
H1B Sponsorship
Amazon has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (22803)
2024 (21175)
2023 (19057)
2022 (24088)
2021 (12233)
2020 (14881)
Funding
Current Stage: Public Company
Total Funding: $8.11B
Key Investors: Amazon, Kleiner Perkins
2023-01-03 · Post-IPO Debt · $8B
2001-07-24 · Post-IPO Equity · $100M
1997-05-15 · IPO
Company data provided by Crunchbase