NVIDIA · 3 days ago
Senior Deep Learning Architect, LLM Inference
NVIDIA is at the forefront of the generative AI revolution, and they are seeking a Senior Deep Learning Architect for LLM Inference. The role involves characterizing LLMs and inference servers, collaborating with teams to optimize performance, and contributing to deep learning software projects.
Responsibilities
You will be responsible for characterizing the latest LLMs and inference servers like vLLM and SGLang to ensure that TRT-LLM maintains its leadership position
Join forces with the performance marketing team to build engaging content, including blog posts and other written materials, that highlight TRT-LLM's outstanding achievements
Collaborate with engineers from AI startup companies to debug and establish standard methodologies
Profile GPU kernel-level performance to identify hardware and software optimization opportunities
Develop profiling and analysis software tools that can keep up with the rapid pace of network scaling
Contribute to deep learning software projects, such as PyTorch, TRT-LLM, vLLM, and SGLang to drive advancements in the field
Verify that TRT-LLM's performance meets expectations for new GPU product launches
Collaborate across the company to guide the direction of inference serving, working with software, research, and product teams to ensure world-class performance
Qualifications
Required
Master's or PhD degree in Computer Science, Computer Engineering, or related fields, or equivalent experience
6+ years of relevant industry experience
Detailed knowledge of deep learning inference serving, PyTorch programming, profiling, and compiler optimizations
Proficiency in Python and C++ programming languages and familiarity with CUDA
Experience with LLMs and their performance challenges and opportunities
Solid understanding of CPU and GPU microarchitecture and performance characteristics
Experience with complex software projects like frameworks, compilers, or operating systems
Good written and verbal communication skills and the ability to work independently and collaboratively in a fast-paced environment
Preferred
Demonstrate a drive to continuously improve software and hardware performance
Showcase examples of novel use cases for agentic AI tools in the workplace
Experience with database and visualization tools like D3.js will set you apart
Benefits
Equity
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is presented below for your
reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity
Company data provided by Crunchbase