Software Engineer, Inference jobs in United States

Modular · 1 day ago

Software Engineer, Inference

Modular is on a mission to revolutionize AI infrastructure by rebuilding the AI software stack. The role involves building end-to-end distributed LLM inference deployments, focusing on operational excellence and collaboration with various teams to enhance application performance.

AI Infrastructure, Artificial Intelligence (AI), Generative AI, Machine Learning, Software
H1B Sponsor Likely

Responsibilities

Build & ship Modular’s LLM-focused inference services using best-in-class inference techniques (e.g., disaggregated inference, multi-node deployment of large models, high-performance networking, high-throughput batch processing)
Build the distributed systems needed to support high-performance inference (e.g., distributed KV cache, expert-parallel request routing & rebalancing)
Push the envelope for operational excellence with request-to-kernel observability, multi-cloud deployments, clever autoscaling, cold-start optimizations, and more
Collaborate with our kernels and genAI teams to achieve SOTA application performance by integrating SOTA kernel & serving optimizations with SOTA cluster optimizations
Build Helm charts, Kubernetes operators, and more to create simple, effective, maintainable deployments

Qualifications

Backend engineering, ML inference infrastructure, Kubernetes, Software tools development, Cultural alignment, High performance computing, LLM Frameworks, Golang familiarity, Problem-solving, Team-oriented attitude

Required

5+ years of experience working in backend engineering
Experience working on high-scale ML inference infrastructure (traditional AI or genAI)
Experience with Kubernetes and operating your own services
Ability to create durable, reusable software tools and libraries that are leveraged across teams and functions
Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture
Strong identification with our core company cultural values

Preferred

Experience with high-performance computing and networking (RDMA, RoCE, InfiniBand, etc.)
Experience with LLM frameworks (vLLM, SGLang, TensorRT-LLM)
Familiarity with Golang

Benefits

Premier insurance plans
Up to 5% 401(k) matching
Flexible paid time off
Annual target bonus
Equity

Company

Modular

Modular provides AI infrastructure for deployment, serving, and programming GPUs.

H1B Sponsorship

Modular has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (10)
2024 (6)
2023 (8)
2022 (4)

Funding

Current Stage: Growth Stage
Total Funding: $380M
Key Investors: US Innovative Technology Fund, General Catalyst, Google Ventures
2025-09-24 · Series C · $250M
2023-08-24 · Series B · $100M
2022-06-30 · Seed · $30M

Leadership Team

Chris Lattner, CEO + Co-Founder
Tim Davis, Co-Founder & President
Company data provided by Crunchbase