NVIDIA · 5 hours ago
AI Software Engineer, LLM Inference Performance Analysis - New College Grad 2026
NVIDIA is at the forefront of the generative AI revolution. We are looking for a Software Engineer specializing in performance analysis and optimization for LLM inference to join our performance engineering team, focusing on improving the efficiency and scalability of large language model inference on NVIDIA Computing Platforms.
AI Infrastructure · Artificial Intelligence (AI) · Consumer Electronics · Foundational AI · GPU · Hardware · Software · Virtual Reality
Responsibilities
Analyze the performance of LLMs running on NVIDIA Compute Platforms using profiling, benchmarking, and performance analysis tools
Understand compiler optimization pipelines and find opportunities to improve them, including IR-based compiler middle-end optimizations and kernel-level transformations
Design and develop new compiler passes and optimization techniques to deliver best-in-class, robust, and maintainable compiler infrastructure and tools
Collaborate with hardware architecture, compiler, and kernel teams to understand how hardware and software co-design enables efficient LLM inference
Work with globally distributed teams across compiler, kernel, hardware, and framework domains to investigate performance issues and contribute to solutions
Qualifications
Required
Master's or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience
Strong hands-on programming expertise in C++ and Python, with solid software engineering fundamentals
Foundational understanding of modern deep learning models (including transformers and LLMs) and interest in inference performance and optimization
Exposure to compiler concepts such as intermediate representations (IR), graph transformations, scheduling, or code generation through coursework, research, internships, or projects
Familiarity with at least one deep learning framework or compiler/runtime ecosystem (e.g., TensorRT-LLM, PyTorch, JAX/XLA, Triton, vLLM, or similar)
Ability to analyze performance bottlenecks and reason about optimization opportunities across model execution, kernels, and runtime systems
Experience working on class projects, internships, research, or open-source contributions involving performance-critical systems, compilers, or ML infrastructure
Strong communication skills and the ability to collaborate effectively in a fast-paced, team-oriented environment
Preferred
Proficiency in CUDA programming and familiarity with GPU-accelerated deep learning frameworks and performance tuning techniques
Demonstrated innovative use of agentic AI tools to enhance productivity and workflow automation
Active engagement with the open-source LLVM or MLIR community to ensure tighter integration and alignment with upstream efforts
Benefits
Equity
Benefits
Company
NVIDIA
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI.
H1B Sponsorship
NVIDIA has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (1877)
2024 (1355)
2023 (976)
2022 (835)
2021 (601)
2020 (529)
Funding
Current Stage: Public Company
Total Funding: $4.09B
Key Investors: ARPA-E, ARK Investment Management, SoftBank Vision Fund
2023-05-09 Grant · $5M
2022-08-09 Post-IPO Equity · $65M
2021-02-18 Post-IPO Equity
Company data provided by crunchbase