eBay · 1 day ago
AI Platform Systems Software Engineer
eBay is a global ecommerce leader committed to changing the way the world shops and sells. They are seeking an experienced AI Platform Systems Software Engineer to design, implement, and optimize the core infrastructure for AI/ML workloads, impacting the scalability and performance of applications serving eBay's global marketplace.
Auctions · E-Commerce · Internet · Marketplace · Retail
Responsibilities
Design and scale services to orchestrate AI/ML clusters across cloud and on-prem environments, supporting VM and Kubernetes-based deployments, including Ray (ray.io) clusters for distributed training and online inference
Develop and optimize intelligent scheduling and resource management systems for heterogeneous compute clusters (CPU, GPU, accelerators)
Integrate Ray Train/Tune for large-scale distributed training workflows and Ray Serve for low-latency, autoscaled inference; build platform hooks for observability, canary/A-B rollouts, and fault tolerance
Build features to improve reliability, performance, observability, and cost-efficiency of AI workloads at scale
Enhance the control plane to support secure multi-tenancy and enterprise-grade governance
Implement systems for container management, dependency resolution, and large-scale model distribution
Collaborate with ML researchers, applied scientists, and distributed systems engineers to drive platform innovation
Provide production support and work closely with field teams to resolve infrastructure issues
Qualifications
Required
Bachelor's or Master's degree in Computer Science, Engineering, or related field (or equivalent experience)
8-10 years of experience building and maintaining infrastructure for highly available, scalable, and performant distributed systems
Proven expertise with cloud-native technologies (AWS, GCP, Azure) and Kubernetes-based deployments
Hands-on experience running ML training and inference with Ray (ray.io)—e.g., Ray Train/Tune for distributed training and Ray Serve for production inference—covering autoscaling, fault tolerance, observability and multi-tenant operations
Deep understanding of networking, security, authentication, and identity management in distributed/cloud environments
Hands-on experience with observability stacks (Prometheus, Grafana, OpenTelemetry, etc.)
Strong coding skills in Go and/or Python; familiarity with other systems-level languages is a plus
Knowledge of Linux internals, containers, and storage systems
Experience optimizing for GPU/accelerator integration (NVIDIA, AMD, TPU, etc.) is highly desirable
Benefits
Target bonus
Restricted stock units
Full range of medical and financial benefits
401(k) eligibility
Various paid time off benefits
PTO
Parental leave
Company
eBay
eBay is a global online marketplace enabling users to buy, sell, and auction new or used items across various categories.
H1B Sponsorship
eBay has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (572)
2024 (883)
2023 (779)
2022 (682)
2021 (748)
2020 (766)
Funding
Current Stage: Public Company
Total Funding: $1.16B
Key Investors: Benchmark
2022-11-07 · Post-IPO Debt · $1.15B
1998-09-24 · IPO
1998-01-01 · Series Unknown
Company data provided by Crunchbase