Baseten · 8 hours ago
Engineering Manager - Model Performance
Baseten is a rapidly growing company that powers mission-critical inference for dynamic AI companies. They are seeking an Engineering Manager focused on ML performance and inference to lead a team of engineers while remaining hands-on with technology.
Artificial Intelligence (AI) · Developer Tools · Machine Learning · Software · Software Engineering
Responsibilities
Lead, mentor, and manage a team of engineers focused on developing and optimizing ML model inference and performance
Oversee technical strategy and architecture decisions, driving improvements across our engineering organization
Collaborate with cross-functional teams to ensure seamless integration and scalability of ML models in production environments
Dive into the codebase of frameworks like TensorRT, PyTorch, CUDA, and others to identify and solve complex performance bottlenecks
Drive the development and deployment of large-scale optimization techniques for various ML models, especially large language models (LLMs)
Own the full lifecycle of projects from inception through delivery, including planning, execution, and resource management
Foster a collaborative, inclusive team environment that encourages continuous learning and growth
Qualifications
Required
Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, or a related field
5+ years of professional experience in software engineering, with at least 2 years in a technical leadership role
Proven experience managing and mentoring teams of engineers
Expertise in one or more programming languages, such as Python, C++, or Go
In-depth understanding of ML model performance optimization, especially using libraries such as PyTorch, TensorRT, and CUDA
Strong knowledge of containerization (Docker) and orchestration systems (Kubernetes)
Experience with production-level AI/ML solutions, including scaling and deploying large models
Ability to balance hands-on technical work with team leadership and project management
Preferred
Experience enhancing the performance of large language models (LLMs) or similar AI systems
Familiarity with LLM optimization techniques such as quantization, speculative decoding, or continuous batching
Deep knowledge of GPU architecture and performance tuning
Previous experience in a high-growth startup environment
Benefits
Competitive compensation, including meaningful equity
100% coverage of medical, dental, and vision insurance for employees and dependents
Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
Paid parental leave
Company-facilitated 401(k)
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities
Company
Baseten
Baseten is an AI infrastructure company that integrates machine learning into business operations, production, and processes.
H1B Sponsorship
Baseten has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart omitted: Distribution of Different Job Fields Receiving Sponsorship; a legend marks the job field similar to this job]
Trends of Total Sponsorships
2025 (6)
2024 (8)
2023 (1)
2020 (1)
Funding
Current Stage: Late Stage
Total Funding: $285M
Key Investors: Bond, Greylock
2025-09-05 · Series D · $150M
2025-02-19 · Series C · $75M
2024-03-04 · Series B · $40M
Company data provided by Crunchbase