Alabama Power Company · 3 months ago
ML Ops Engineer
Alabama Power Company, part of Southern Company, is a leading energy provider serving millions of customers. The ML Ops Engineer will design and operate the production backbone for the AI Hub, ensuring AI and machine learning systems are deployed, monitored, and governed at scale while driving the enterprise-wide MLOps framework.
Electrical Distribution, Energy, Logistics, Retail
Responsibilities
Operationalize AI and agentic systems. Build and maintain CI/CD pipelines for models, prompts, tools, and multi-agent workflows, enabling consistent promotion from experimentation to production
Implement AI observability and reliability. Establish monitoring for agent behavior, model performance, drift, cost, and safety outcomes using logs, traces, metrics, and evaluators
Enforce governance through automation. Embed guardrails, approvals, and policy-as-code into deployment pipelines, enabling compliant AI delivery without manual bottlenecks
Manage model and agent lifecycle. Own versioning, rollout strategies (canary, shadow, rollback), and decommissioning for models, agents, and supporting tools
Ensure platform resilience and scalability. Design runtime patterns that meet availability, latency, and fail-safe requirements, including degraded-mode and read-only behaviors for sensitive use cases
Support multi-vendor and multi-cloud execution. Enable portable deployments across hyperscalers and model providers, minimizing lock-in while maintaining consistent operational controls
Partner with engineering and data teams. Work closely with AI Architects, data engineers, and product squads to resolve production issues and continuously improve developer experience
Qualifications
Required
Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field
Proven experience (5+ years) in cloud engineering or DevOps, with 2+ years in MLOps, AI infrastructure, data engineering, ML engineering, or a similar role
Experience operating machine learning and AI systems in regulated or mission-critical environments
Strong understanding of ML lifecycle management, including experimentation, validation, deployment, monitoring, and retirement
Familiarity with agentic AI runtime patterns, including orchestration, tool execution, and human-in-the-loop controls
Knowledge of enterprise AI governance, observability, and maturity models
Operational mindset with strong ownership and bias toward reliability and automation
Ability to troubleshoot complex, distributed AI systems under production constraints
Clear communicator who can translate operational risks into actionable improvements
Continuous improvement orientation, balancing speed, safety, and cost
Hands-on expertise with CI/CD and MLOps tooling (e.g., GitHub Actions, Azure DevOps, Terraform)
Experience deploying and operating LLMs, agents, and inference services using containers and orchestration platforms (e.g., Kubernetes)
Proficiency in observability stacks for AI systems (logging, tracing, metrics, evaluation pipelines)
Strong grounding in cloud security and identity, including secrets management, network isolation, and least-privilege access
Experience with enterprise model registries, feature stores, vector databases, and automated testing for AI workflows
Deep expertise in Python. Experience with machine learning frameworks and libraries such as PyTorch or scikit-learn
Experience with ML lifecycle tools like MLflow
Experience with cloud computing services (Azure and GCP preferred) and their machine learning tools
Preferred
Relevant certifications in AI, ML, or data engineering
Experience in the energy sector is a plus
Experience in multi-cloud environments is a plus
Experience designing reusable AI products, agents, and services in a multi-business environment
Benefits
Competitive base salary
Annual incentive awards for eligible employees
Health, welfare and retirement benefits designed to support physical, financial, and emotional/social well-being
Incentive program
Company
Alabama Power Company
Alabama Power provides the valuable combination of [...]. It is a sub-organization of Southern Company.
Funding
Current Stage
Late Stage
Leadership Team
Recent News
Morningstar.com
2025-11-04
Company data provided by Crunchbase