Senior DevOps Engineer, ML Infrastructure jobs in United States

Serve Robotics · 3 weeks ago

Senior DevOps Engineer, ML Infrastructure

Serve Robotics is reimagining urban delivery with its sidewalk robots, designed for efficient and accessible delivery service. The Senior DevOps Engineer will play a crucial role in building and maintaining a large-scale ML platform, ensuring reliability and performance across internal systems while collaborating with data scientists and ML engineers.

Artificial Intelligence (AI), Food Delivery, Logistics, Robotics
H1B Sponsor Likely

Responsibilities

Deploy and maintain our ML training orchestration system that operates across multiple platforms
Manage cloud and on-premises environments for large-scale distributed data processing and ML training/inference systems
Automate deployment pipelines, monitoring, and alerting for ML and data services
Collaborate closely with data scientists, ML engineers, and autonomy teams to streamline experimentation and model deployment
Maintain and improve CI/CD systems to support rapid development and testing
Implement best practices for system security, reliability, and observability
Optimize infrastructure costs and ensure efficient resource utilization
Improve internal developer productivity through tooling, documentation, and support

Qualifications

Cloud platforms, Container orchestration, Infrastructure-as-code, CI/CD systems, Python, SQL, Cloud security, GPU cluster management, Observability stacks, Open-source contributions

Required

Bachelor's or Master's degree in Computer Science, Engineering, or equivalent experience
5+ years of experience as a DevOps, SRE, or Infrastructure Engineer, preferably supporting ML or data-intensive systems
Strong experience with cloud platforms (AWS, GCP, or Azure) and container orchestration (Kubernetes, Docker)
Proficiency in infrastructure-as-code tools such as Terraform or Helm
Solid understanding of CI/CD systems (GitLab CI, Jenkins, ArgoCD, etc.)
Experience with Python and SQL
Experience with cloud security, IAM (Identity and Access Management), and access control
Experience analyzing and optimizing hardware performance
Experience with GPU cluster management

Preferred

Experience managing large-scale distributed data processing systems
Experience analyzing and optimizing ML training workloads
Background in observability stacks (Prometheus, Grafana, ELK, OpenTelemetry)
Contributions to open-source DevOps or ML infrastructure projects

Company

Serve Robotics

Serve Robotics is an autonomous robotic delivery company that develops AI-powered sidewalk delivery robots.

H1B Sponsorship

Serve Robotics has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The information below is provided for reference. (Data from the US Department of Labor)
Trends of Total Sponsorships
2023 (4)
2022 (1)
2021 (5)

Funding

Current Stage: Public Company
Total Funding: $394M
Key Investors: Postmates, NVIDIA, Neo
2025-10-10 · Post-IPO Equity · $100M
2025-01-07 · Post-IPO Equity · $80M
2024-12-01 · Post-IPO Equity · $86M

Leadership Team

Ali Kashani, Co-founder & CEO
Dmitry Demeshchuk, Co-Founder and VP of Software
Company data provided by Crunchbase