
Rockstar · 3 weeks ago

Backend Software Engineer (ML Infra)

Rockstar is a mobile-first digital product studio that creates extraordinary digital experiences. They are hiring a Backend Software Engineer (ML Infrastructure) to design, build, and scale core systems for large-scale model training and deployment, collaborating closely with ML engineers to enhance production-grade infrastructure.

Staffing & Recruiting

Responsibilities

Design and implement backend systems that support large-scale ML workloads, including fine-tuning and reinforcement learning
Build distributed training and inference pipelines that are efficient, fault-tolerant, and observable
Develop internal developer tools and platforms that make it easier for ML engineers to train, evaluate, and deploy models
Work on cloud-native systems using containers and orchestration (e.g., Kubernetes)
Optimize systems for performance, reliability, and cost efficiency, especially for GPU-heavy workloads
Implement monitoring, logging, and observability for long-running training jobs and production services
Partner closely with ML engineers to support evolving model architectures, training workflows, and evaluation needs
Translate ML requirements into scalable backend and infrastructure solutions

Qualifications

Backend engineering, Distributed systems, Python, Cloud-native systems, Containerization, Kubernetes, ML infrastructure, GPU workloads, Monitoring, Logging, Technical curiosity, Collaboration

Required

1–3 years of backend engineering experience, ideally working on production systems
Strong fundamentals in distributed systems, networking, and backend architecture
Experience building systems that scale under real load
Comfortable working in Python and/or Go (or similar backend languages)
Excited to work on-site in San Francisco with a fast-moving early-stage team

Preferred

Experience with or exposure to ML infrastructure or ML platforms
Familiarity with GPU workloads, training pipelines, or inference systems
Experience with containerization and orchestration (Docker, Kubernetes)
Contributions to, or deep familiarity with, ML infrastructure libraries such as Ray, vLLM, SGLang, or similar distributed ML systems

Company

Rockstar

Rockstar is rebuilding the infrastructure for employability by collapsing the cost of hiring.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase