Principal DevOps Engineer - ML/AI Algorithms jobs in United States

Roche · 3 months ago

Principal DevOps Engineer - ML/AI Algorithms

Roche is a global healthcare company dedicated to advancing science and ensuring access to healthcare. As a Principal DevOps Engineer - ML/AI Algorithms, you will collaborate with stakeholders to develop the build, release, and deploy toolchain for DevOps, contributing to digital health products that enhance patient care.

Biotechnology, Health Care, Health Diagnostics, Oncology, Pharmaceutical, Precision Medicine
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Lead the initiative to set up, manage, and meticulously maintain parity across development, staging, and production application environments in cutting-edge cloud infrastructure, ensuring a robust and consistent deployment pipeline
Champion the development of advanced monitoring infrastructure, giving the team real-time insights and ensuring high levels of system reliability and performance
Provide dedicated on-call support for production operations, ensuring the uninterrupted delivery of critical services and swift resolution of any operational issues
Interface with software developers, product managers, test engineers and administrators on projects to design and develop the build, release, and deploy toolchain for DevOps while providing on-call support
Identify, troubleshoot and resolve issues quickly and effectively, sometimes under pressure
Take an active role in planning, high-availability engineering, performance tuning, and automation and tools development
Manage multiple releases with focus on system reliability, scalability, and efficiency
Implement and manage the full lifecycle of machine learning models, including versioning, deployment strategies (e.g., canary, A/B testing), monitoring for drift and performance, and decommissioning
Bring leadership to improving DevOps technology and processes, and mentor other DevOps engineers on the team

Qualifications

DevOps practices, Container technology, Machine learning models, AWS Cloud infrastructure, Unix/Linux administration, Infrastructure as Code, Python, Soft skills

Required

Bachelor's degree in Computer Science, Engineering, or a related field with 8+ years of experience in a DevOps role, or an equivalent combination of education and experience to perform at this level
8+ years of experience with container technology, including Kubernetes, AWS EKS, Helm charts, Splunk, and Docker, along with provisioning infrastructure through IaC using Terraform and cloud automation principles
Proficiency in Unix/Linux administration, shell scripting, and internals, with a preference for Ubuntu
Deep working experience and extensive knowledge in building and deploying infrastructure using IaC frameworks such as Terraform and AWS CloudFormation/SAM
Experience building and automating scalable data pipelines for ingestion, transformation, distributed computing, and versioning of large-scale image datasets
Familiarity with DevOps practices and proficiency in log analysis and monitoring tools are essential for effective troubleshooting and system optimization
Proficiency in Python for automating production systems, and with Git, GitLab, GitHub Actions, and GitHub CI/CD; familiarity with common ML libraries such as TensorFlow, PyTorch, and scikit-learn to understand the engineering needs of the ML models you will be deploying
Strong working knowledge of AWS Cloud infrastructure, including EC2, S3, API Gateway, Kubernetes, RDS, VPC peering, Route53, IAM, Batch, Lambda, AWS Config, and Auto Scaling

Preferred

Demonstrated MLOps experience supporting machine learning or computer vision teams
Deep experience with container orchestration for ML workloads using Kubernetes, including frameworks like Kubeflow or KubeRay to manage distributed training jobs
Familiarity with data versioning tools like DVC
Familiarity with common ML libraries such as TensorFlow, PyTorch, and scikit-learn to understand the engineering needs of the ML models
Familiarity with other languages such as Java, R, and C/C++
Experience with AWS services for machine learning, such as Amazon SageMaker, and experience managing GPU-accelerated compute instances (e.g., EC2 P and G series) for model training and inference

Benefits

A discretionary annual bonus may be available based on individual and Company performance.
This position also qualifies for the benefits detailed at the link provided below.

Company

Roche is a pharmaceutical and diagnostics company that offers medicines and diagnostic tests for various medical conditions and diseases.

H1B Sponsorship

Roche has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (12)
2024 (9)
2023 (6)
2022 (2)
2021 (2)

Funding

Current Stage: Public Company
Total Funding: $7.79B
Key Investors: SoftBank, SCALE AI, Novartis
2021-08-04 · Post-IPO Equity · $5B
2020-12-07 · IPO
2020-05-06 · Post-IPO Equity · $0.5M

Leadership Team

Alan Hippe
Member of the Executive Board - Group CFO
Christine Bakan
Global Head and VP of Software and Bioinformatics, Next Generation Sequencing
Company data provided by Crunchbase