LLM Serving Engineer (Cloud AI Engineering), Senior / Staff Engineer jobs in United States

Qualcomm · 19 hours ago

LLM Serving Engineer (Cloud AI Engineering), Senior / Staff Engineer

Qualcomm Technologies, Inc. is utilizing its traditional strengths in digital wireless technologies to play a central role in the evolution of Cloud AI. The LLM Serving Engineer role involves building scalable LLM inference platforms and contributing to the development of LLM Serving packages, while collaborating closely with customers and internal teams.
Artificial Intelligence (AI), Generative AI, Software, Telecommunications, Wireless
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Build a scalable LLM inference platform using inference techniques (e.g., disaggregated serving, KV-cache management, advanced parallelism, speculative algorithms, model optimization, specialized kernels)
Contribute to the development of LLM serving packages (e.g., vLLM, SGLang, TGI, Triton Inference Server, Dynamo, llm-d)
Work closely with customers to drive solutions by collaborating with internal compiler, firmware and platform teams
Work at the forefront of GenAI by understanding advanced algorithms (e.g. attention mechanisms, MoEs) and numerics to identify new optimization opportunities
Drive efficient serving through smart autoscaling, load balancing and routing
Engage with open-source serving communities to evolve the framework

Qualifications

LLM serving packages, PyTorch, Computer architecture, Distributed systems, Inference optimization techniques, Large-scale projects, Deep learning workloads, Proactive learning, Computer science fundamentals, Communication, Problem-solving skills, Collaborative environment

Required

Hands-on experience with one or more of the following LLM serving/orchestration packages: Triton Inference Server, vLLM, SGLang, Ollama, llm-d, KServe, LMCache, Mooncake
Deep understanding of foundational LLMs, VLMs, SLMs, and transformer-based architectures
Strong experience in developing language models using PyTorch
Strong computer science fundamentals - algorithms, data structures, parallel and distributed programming
Understanding of computer architecture, ML accelerators, in-memory processing and distributed systems
Strong Python development skills for large-scale projects and a passion for software engineering
Experience in analyzing, profiling, and optimizing deep learning workloads
Proactive learning about the latest inference optimization techniques
Excellent communication and problem-solving skills, with the ability to thrive in a fast-paced and collaborative environment
MS in Computer Science, Machine Learning, Computer Engineering or Electrical Engineering
Bachelor's degree in Computer Science, Engineering, Information Systems, or related field and 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience; or
Master's degree in Computer Science, Engineering, Information Systems, or related field and 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience; or
PhD in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience

Preferred

Open-source contribution to any GenAI package
Experience architecting and developing large-scale distributed systems
High-level kernel design experience (PyTorch, CUDA, Triton)
Knowledge of torch.compile or TorchDynamo
PhD in Computer Science, Computer Engineering or Machine Learning

Benefits

Competitive annual discretionary bonus program
Opportunity for annual RSU grants
Highly competitive benefits package

Company

Qualcomm

Qualcomm designs wireless technologies and semiconductors that power connectivity, communication, and smart devices.

H1B Sponsorship

Qualcomm has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships (year: total sponsorships)
2025: 2,013
2024: 1,910
2023: 3,216
2022: 2,885
2021: 2,104
2020: 1,181

Funding

Current Stage
Public Company
Total Funding
$3.5M
1991-12-20: IPO
1988-01-01: Undisclosed · $3.5M

Leadership Team

Cristiano Amon
President & CEO
Isaac Eteminan
CEO
Company data provided by Crunchbase