ML Ops Support Engineer jobs in United States

Vibotek LLC · 2 months ago

ML Ops Support Engineer

Vibotek LLC is seeking an MLOps L2 Support Engineer to provide 24/7 production support for machine learning and data pipelines. The role involves troubleshooting ML workflows and ensuring high availability of ML models in production environments.

Information Technology & Services

Responsibilities

Provide L2 support for MLOps production environments, ensuring uptime and reliability
Troubleshoot ML pipelines, data processing jobs, and API issues
Monitor logs, alerts, and performance metrics using Dataiku, Prometheus, Grafana, or AWS tools such as CloudWatch
Perform root cause analysis (RCA) and resolve incidents within SLAs
Escalate unresolved issues to L3 engineering teams when needed
Manage Dataiku DSS workflows, troubleshoot job failures, and optimize performance
Monitor and support Dataiku plugins, APIs, and automation scenarios
Collaborate with Data Scientists and Data Engineers to debug ML model deployments
Perform version control and CI/CD integration for Dataiku projects
Support CI/CD pipelines for ML model deployment (Bamboo, Bitbucket, etc.)
Deploy ML models and data pipelines using Docker, Kubernetes, or Dataiku Flow
Automate monitoring and alerting for ML model drift, data quality, and performance
Monitor AWS-based ML workloads (SageMaker, Lambda, ECS, S3, RDS)
Manage storage and compute resources for ML workflows
Support database connections, data ingestion, and ETL pipelines (SQL, Spark, Kafka)
Ensure secure access control for ML models and data pipelines
Support audit, compliance, and governance for Dataiku and MLOps workflows
Respond to security incidents related to ML models and data access

Qualifications

MLOps, Dataiku DSS, AWS ML services, CI/CD pipelines, Python, Monitoring tools, Incident Response, Docker, Kubernetes, SQL, Bash, ITIL certifications, DevOps certifications

Required

5+ years in MLOps, Data Engineering, or Production Support
Strong experience in Dataiku workflows, scenarios, plugins, and APIs
Hands-on experience with AWS ML services (SageMaker, Lambda, S3, RDS, ECS, IAM)
Familiarity with GitHub Actions, Jenkins, or Terraform
Proficiency in Python, Bash, and SQL for automation and debugging
Experience with Prometheus, Grafana, CloudWatch, or ELK Stack
Ability to handle on-call support, weekend shifts, and SLA-based issue resolution

Preferred

Experience with Docker, Kubernetes, or OpenShift
Familiarity with TensorFlow Serving, MLflow, or Dataiku Model API
Experience with Spark, Databricks, Kafka, or Snowflake
ITIL Foundation, AWS ML, or Dataiku certification

Company

Vibotek LLC

We screen and shortlist candidates before presenting them to our clients, reducing hiring time and cost.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase