
NVIDIA · 9 hours ago

AI Software Engineer, LLM Inference Performance Analysis - New College Grad 2026

NVIDIA is at the forefront of the generative AI revolution. We are looking for a Software Engineer, Performance Analysis and Optimization for LLM Inference, to join our performance engineering team, focusing on improving the efficiency and scalability of large language model inference on NVIDIA computing platforms.

AI Infrastructure · Artificial Intelligence (AI) · Consumer Electronics · Foundational AI · GPU · Hardware · Software · Virtual Reality
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Analyze the performance of LLMs running on NVIDIA Compute Platforms using profiling, benchmarking, and performance analysis tools (see the brief sketch after this list)
Understand and find opportunities for compiler optimization pipelines, including IR-based compiler middle-end optimizations and kernel-level transformations
Design and develop new compiler passes and optimization techniques to deliver best-in-class, robust, and maintainable compiler infrastructure and tools
Collaborate with hardware architecture, compiler, and kernel teams to understand how hardware-software co-design enables efficient LLM inference
Work with globally distributed teams across compiler, kernel, hardware, and framework domains to investigate performance issues and contribute to solutions
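
For a concrete sense of the profiling and benchmarking work described in the first responsibility, below is a minimal, illustrative sketch of timing LLM generation on an NVIDIA GPU with CUDA events. This is not part of the posting: it assumes a CUDA-capable machine with PyTorch and the Hugging Face transformers library installed, and "gpt2" is purely a placeholder model. Production analysis would typically rely on dedicated tools such as NVIDIA Nsight Systems rather than hand-rolled timers.

```python
# Illustrative only: measure generation latency and throughput for a small
# causal LM on an NVIDIA GPU using CUDA events. "gpt2" is a placeholder model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any Hugging Face causal LM would work here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).cuda().eval()

inputs = tokenizer("GPUs accelerate large language model inference",
                   return_tensors="pt").to("cuda")
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

with torch.no_grad():
    # Warm-up run to exclude one-time CUDA and allocator initialization costs
    model.generate(**inputs, max_new_tokens=16)
    torch.cuda.synchronize()

    start.record()
    output = model.generate(**inputs, max_new_tokens=128)
    end.record()
    torch.cuda.synchronize()  # wait for the GPU before reading the timer

new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
elapsed_ms = start.elapsed_time(end)
print(f"{new_tokens} tokens in {elapsed_ms:.1f} ms "
      f"({new_tokens / (elapsed_ms / 1000.0):.1f} tokens/s)")
```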

Qualifications

C++ · Python · Deep learning models · Compiler optimization · CUDA programming · Deep learning frameworks · Performance analysis tools · Communication skills · Team collaboration

Required

Master's or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience
Strong hands-on programming expertise in C++ and Python, with solid software engineering fundamentals
Foundational understanding of modern deep learning models (including transformers and LLMs) and interest in inference performance and optimization
Exposure to compiler concepts such as intermediate representations (IR), graph transformations, scheduling, or code generation through coursework, research, internships, or projects
Familiarity with at least one deep learning framework or compiler/runtime ecosystem (e.g., TensorRT-LLM, PyTorch, JAX/XLA, Triton, vLLM, or similar)
Ability to analyze performance bottlenecks and reason about optimization opportunities across model execution, kernels, and runtime systems
Experience working on class projects, internships, research, or open-source contributions involving performance-critical systems, compilers, or ML infrastructure
Strong communication skills and the ability to collaborate effectively in a fast-paced, team-oriented environment

Preferred

Proficiency in CUDA programming and familiarity with GPU-accelerated deep learning frameworks and performance tuning techniques
Demonstrated innovative applications of agentic AI tools that enhance productivity and workflow automation
Active engagement with the open-source LLVM or MLIR community to ensure tighter integration and alignment with upstream efforts

Benefits

Equity
Benefits

Company

NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.

H1B Sponsorship

NVIDIA has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. The information below is provided for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025: 1877
2024: 1355
2023: 976
2022: 835
2021: 601
2020: 529

Funding

Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09: Grant · $5M
2022-08-09: Post-IPO Equity · $65M
2021-02-18: Post-IPO Equity

Leadership Team

Jensen Huang, Founder and CEO
Michael Kagan, Chief Technology Officer

Company data provided by Crunchbase