Senior Deep Learning Architect, LLM Inference jobs in United States

NVIDIA · 3 days ago

Senior Deep Learning Architect, LLM Inference

NVIDIA is at the forefront of the generative AI revolution and is seeking a Senior Deep Learning Architect for LLM Inference. The role involves characterizing LLMs and inference servers, collaborating with teams to optimize performance, and contributing to deep learning software projects.

AI Infrastructure · Artificial Intelligence (AI) · Consumer Electronics · Foundational AI · GPU · Hardware · Software · Virtual Reality
Growth Opportunities
H1B Sponsor Likely
Hiring Manager: Joshua Hasten

Responsibilities

You will be responsible for characterizing the latest LLMs and inference servers like vLLM and SGLang to ensure that TRT-LLM maintains its leadership position
Join forces with the performance marketing team to build engaging content, including blog posts and other written materials, that highlight TRT-LLM's outstanding achievements
Collaborate with engineers from AI startup companies to debug and establish standard methodologies
Profile GPU kernel-level performance to identify hardware and software optimization opportunities
Develop profiling and analysis software tools that can keep up with the rapid pace of network scaling
Contribute to deep learning software projects, such as PyTorch, TRT-LLM, vLLM, and SGLang, to drive advancements in the field
Verify that TRT-LLM's performance meets expectations for new GPU product launches
Collaborate across the company to guide the direction of inference serving, working with software, research, and product teams to ensure world-class performance

Qualifications

Deep Learning Inference · PyTorch · Python · C++ · CUDA · LLMs · GPU Microarchitecture · Communication Skills · Collaboration

Required

Master's or PhD degree in Computer Science, Computer Engineering, or related fields, or equivalent experience
6+ years of relevant industry experience
Detailed knowledge of deep learning inference serving, PyTorch programming, profiling, and compiler optimizations
Proficiency in Python and C++ programming languages and familiarity with CUDA
Experience with LLMs and their performance challenges and opportunities
Solid understanding of CPU and GPU microarchitecture and performance characteristics
Experience with complex software projects like frameworks, compilers, or operating systems
Good written and verbal communication skills and the ability to work independently and collaboratively in a fast-paced environment

Preferred

Demonstrate a drive to continuously improve software and hardware performance
Showcase examples of novel use cases for agentic AI tools in the workplace
Experience with database and visualization tools like D3.js will set you apart

Benefits

Equity
Benefits

Company

NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.

H1B Sponsorship

NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)

Funding

Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 · Grant · $5M
2022-08-09 · Post-IPO Equity · $65M
2021-02-18 · Post-IPO Equity

Leadership Team

Jensen Huang, Founder and CEO
Michael Kagan, Chief Technology Officer
Company data provided by Crunchbase