Alabama Power Company · 3 months ago

ML Ops Engineer

Alabama Power Company, part of Southern Company, is a leading energy provider serving millions of customers. The ML Ops Engineer will design and operate the production backbone for the AI Hub, ensuring AI and machine learning systems are deployed, monitored, and governed at scale while driving the enterprise-wide MLOps framework.

Electrical Distribution, Energy, Logistics, Retail

Responsibilities

Operationalize AI and agentic systems. Build and maintain CI/CD pipelines for models, prompts, tools, and multi-agent workflows, enabling consistent promotion from experimentation to production
Implement AI observability and reliability. Establish monitoring for agent behavior, model performance, drift, cost, and safety outcomes using logs, traces, metrics, and evaluators
Enforce governance through automation. Embed guardrails, approvals, and policy-as-code into deployment pipelines, enabling compliant AI delivery without manual bottlenecks
Manage model and agent lifecycle. Own versioning, rollout strategies (canary, shadow, rollback), and decommissioning for models, agents, and supporting tools
Ensure platform resilience and scalability. Design runtime patterns that meet availability, latency, and fail-safe requirements, including degraded-mode and read-only behaviors for sensitive use cases
Support multi-vendor and multi-cloud execution. Enable portable deployments across hyperscalers and model providers, minimizing lock-in while maintaining consistent operational controls
Partner with engineering and data teams. Work closely with AI Architects, data engineers, and product squads to resolve production issues and continuously improve developer experience

Qualifications

MLOps, CI/CD pipelines, Cloud engineering, Machine learning frameworks, Observability stacks, Python, Cloud platforms, Communicator, Operational mindset, Continuous improvement

Required

Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or related field
Proven experience (5+ years) in cloud engineering or DevOps, with 2+ years in MLOps, AI infrastructure, data engineering, ML engineering, or a similar role
Experience operating machine learning and AI systems in regulated or mission-critical environments
Strong understanding of ML lifecycle management, including experimentation, validation, deployment, monitoring, and retirement
Familiarity with agentic AI runtime patterns, including orchestration, tool execution, and human-in-the-loop controls
Knowledge of enterprise AI governance, observability, and maturity models
Operational mindset with strong ownership and bias toward reliability and automation
Ability to troubleshoot complex, distributed AI systems under production constraints
Clear communicator who can translate operational risks into actionable improvements
Continuous improvement orientation, balancing speed, safety, and cost
Hands-on expertise with CI/CD and MLOps tooling (e.g., GitHub Actions, Azure DevOps, Terraform)
Experience deploying and operating LLMs, agents, and inference services using containers and orchestration platforms (e.g., Kubernetes)
Proficiency in observability stacks for AI systems (logging, tracing, metrics, evaluation pipelines)
Strong grounding in cloud security and identity, including secrets management, network isolation, and least-privilege access
Experience with enterprise model registries, feature stores, vector databases, and automated testing for AI workflows
Deep expertise in Python. Experience with machine learning frameworks and libraries such as PyTorch or scikit-learn
Experience with ML lifecycle tools such as MLflow
Experience with cloud computing services (Azure and GCP preferred) and their machine learning tools

Preferred

Relevant certifications in AI, ML, or data engineering
Experience in the energy sector is a plus
Experience in multi-cloud environments is a plus
Experience designing reusable AI products, agents, and services in a multi-business environment

Benefits

Competitive base salary
Annual incentive awards for eligible employees
Health, welfare and retirement benefits designed to support physical, financial, and emotional/social well-being
Incentive program

Company

Alabama Power Company

Alabama Power is a sub-organization of Southern Company.

Funding

Current Stage
Late Stage

Leadership Team

Jeff Peoples
Chairman, President and Chief Executive Officer
Moses Feagin
Executive Vice President, CFO and Treasurer
Company data provided by Crunchbase