
Tubi · 6 days ago

Staff Software Engineer, ML Infra & Distributed Systems

Tubi is a free streaming service that entertains over 100 million monthly active users. As a Staff Software Engineer on the ML Infrastructure team, you will collaborate closely with the Machine Learning and Product teams to build world-class machine learning inference platforms that power essential services like personalized recommendations and search.

Advertising · Digital Entertainment · Film · Media and Entertainment
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Design and build scalable, high-throughput, low-latency distributed systems using Scala
Build reusable components and services that serve various ML applications like Personalization, Search, Ads and Exploration
Partner closely with ML engineers to understand their challenges and limitations and develop scalable solutions to address them. Proactively recommend solutions to keep our ML Inference stack state of the art
Take a data-driven approach to identifying and optimizing the latency, cost, and efficiency of our infra. Lead large-scale, cross-functional refactorings when necessary
Mentor other engineers on the team in system design, effective incident management, interviewing, leveraging LLMs for work, etc.
Collaborate with ML, Product, and cross-functional engineering teams to define the long-term vision and architecture for ML Infrastructure at Tubi

Qualifications

Scala · AWS · Distributed systems · Microservices · SQL databases · NoSQL databases · Containerization · Soft skills

Required

8+ years of experience designing and building scalable distributed systems in any modern backend language (e.g., Scala, Java, Python, Go, C++); experience with Scala or another JVM-based language is a plus
Strong experience with AWS or an equivalent cloud platform
Experience building online microservices at scale with low latency serving
Experience with both SQL (e.g. Postgres) and NoSQL databases (e.g. Cassandra), message brokers (e.g. Kafka), and caches (e.g. Redis)
Experience with containerization technologies, such as Docker or Kubernetes
Led the response and resolution efforts for multiple major, large-scale incidents

Preferred

Familiarity with machine learning infrastructure such as inference engines (e.g. TorchServe, Triton, vLLM), vector stores (e.g. LanceDB, FAISS), feature stores (e.g. Feast), ElastiCache, model training orchestration, etc.
Understanding of ML model training pipelines and model internals. Experience with Recommender Systems, Search, Autocomplete and Ads ML is a plus
Previous experience with Akka, Erlang, Elixir or Go
Proficient in data-driven analysis of complex A/B testing results

Benefits

Annual discretionary bonus
Long-term incentive plan
Medical/dental/vision
Insurance
401(k) plan
Paid time off
Flexible Time off Policy
Generous Parental Leave Program
Monthly wellness reimbursement

Company

Tubi is the world’s largest ad-supported video on demand (VOD) service, available on connected TV devices, mobile, and the web.

H1B Sponsorship

Tubi has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (29)
2024 (13)
2023 (19)
2022 (21)
2021 (20)
2020 (19)

Funding

Current Stage
Late Stage
Total Funding
$28M
Key Investors
Jump Capital
2020-03-17 · Acquired
2019-12-23 · Series D
2017-07-01 · Pre Seed

Leadership Team

Anjali Sud
CEO
Sameer Balgi
Chief Financial Officer

Company data provided by Crunchbase