Senior Software Engineering Manager, ML Fleet Systems jobs in United States

Google · 1 hour ago

Senior Software Engineering Manager, ML Fleet Systems

Google is seeking a Senior Software Engineering Manager for its ML Fleet Systems team. The role involves providing technical leadership and managing a team of engineers to optimize and deliver efficient ML infrastructure solutions that align with Google's AI-first strategy.

Apps, Artificial Intelligence (AI), Cloud Storage, Search Engine, SEO
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Define and drive the long-term technical goal, strategy, and roadmap for critical software systems that manage Alphabet's ML fleet. This includes building systems for all ML resources such as TPUs, GPUs, compute, storage, and networking
Collaborate closely with engineering partners (e.g., Onefleet, Spatial Flex, ODS) to design and deliver joint engineered solutions to our customers (Product Areas within Google)
Identify, scope, and solve broad and ambiguous challenges that impact the efficiency, reliability, and cost-effectiveness of the entire ML fleet. Turn these challenges into strategic opportunities and actionable plans
Drive significant improvements in ML fleet metrics, such as utilization, scheduling efficiency, and power consumption, through innovative software and system design
Ensure the long-term health, maintainability, and evolution of the software systems underpinning Google's AI/ML development

Qualifications

C++, Java, Python, Technical leadership, Distributed systems, Google storage systems, Infrastructure optimization, Machine Learning hardware, Resource management systems, People management, Matrixed organization, Performance analysis, Cost reduction, Cluster management, Scheduling algorithms, Team leadership

Required

Bachelor's degree, or equivalent practical experience
8 years of experience programming in C++, Java, Python, Kotlin or Go
5 years of experience in a technical leadership role
5 years of experience in a people management or team leadership role
3 years of experience in designing, analyzing, and troubleshooting distributed systems

Preferred

Master's degree or PhD in Computer Science or related technical field
5 years of experience working in a complex, matrixed organization
Experience with Colossus and other relevant Google storage systems (e.g., Bigtable, Spanner, Woodshed)
Experience with infrastructure optimization, performance analysis, and cost reduction in large-scale environments
Familiarity with Machine Learning hardware accelerators (e.g., TPUs, GPUs) and their life-cycle management
Understanding of resource management systems (e.g., compute infrastructure, Kubernetes, Flex), cluster management, and scheduling algorithms

Benefits

Health, dental, vision, life, disability insurance
Retirement Benefits: 401(k) with company match
Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
Sick Time: 40 hours/year (statutory, where applicable); 5 days/event (discretionary)
Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
Baby Bonding Leave: 18 weeks
Holidays: 13 paid days per year

Company

Google specializes in internet-related services and products, including search, advertising, and software. It is a subsidiary of Alphabet.

H1B Sponsorship

Google has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025: 8,763
2024: 8,872
2023: 9,682
2022: 11,626
2021: 9,109
2020: 9,785

Funding

Current Stage: Public Company
Total Funding: $26.1M
Key Investors: Andy Bechtolsheim
2004-08-19: IPO
1999-06-07: Series Unknown · $25M
1998-11-01: Angel · $1M

Leadership Team

Sundar Pichai, CEO
Thomas Kurian, CEO - Google Cloud
Company data provided by Crunchbase