Baseten · 3 days ago
Software Engineer - Model Performance
Baseten powers mission-critical inference for leading AI companies and is seeking a Software Engineer focused on ML performance. The role involves advancing AI applications by implementing and optimizing techniques for ML model inference, particularly in large language models.
AI Infrastructure · Artificial Intelligence (AI) · Developer Tools · Machine Learning · Software · Software Engineering
Responsibilities
Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure
Dive deep into the underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
Apply and scale optimization techniques across a wide range of ML models, particularly large language models
Collaborate with a diverse team to design and implement innovative solutions
Own projects from idea to production
Qualifications
Required
Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field
Experience with one or more general-purpose programming languages, such as Python or C++
Familiarity with LLM optimization techniques (e.g., quantization, speculative decoding, continuous batching)
Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM
Demonstrated interest and experience in LLMs
Deep understanding of GPU architecture
Preferred
Proficiency in enhancing the performance of software systems, particularly in the context of large language models (LLMs)
Experience with CUDA or similar technologies
Deep understanding of software engineering principles and a proven track record of developing and deploying AI/ML inference solutions
Experience with Docker and Kubernetes
Benefits
Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.
Company
Baseten
Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes.
H1B Sponsorship
Baseten has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (6)
2024 (8)
2023 (1)
2020 (1)
Funding
Current Stage
Late Stage
Total Funding
$285M
Key Investors
Bond, Greylock
2025-09-05 · Series D · $150M
2025-02-19 · Series C · $75M
2024-03-04 · Series B · $40M
Company data provided by Crunchbase