AI20 Labs · 5 hours ago

Senior Data Engineer / Machine Learning Engineer

AI20 Labs specializes in providing advanced AI-driven solutions, empowering users with personalized AI agents tailored to their unique needs. The role involves building a high-fidelity, end-to-end digital twin of the electric grid, focusing on data engineering, feature engineering, machine learning, and MLOps to transform raw grid data into trusted intelligence.
Computer Software

Responsibilities

Design and build scalable data ingestion pipelines for diverse sources:
Grid and market data (e.g., telemetry, operational datasets, filings)
Geospatial data (satellite imagery, maps, infrastructure layers)
Weather and environmental data
Time-series load and generation data
Create clean, versioned, queryable datasets suitable for both ML training and analytics
Develop canonical data models / schemas representing grid topology and asset relationships
Ensure data quality, lineage, reproducibility, and observability across pipelines
Engineer temporal, spatial, and relational features across heterogeneous datasets
Build representations that capture:
Network topology (connectivity, constraints, hierarchy)
Time-dependent behavior (load, generation, congestion, weather)
Physical constraints and operational limits
Collaborate with physics-based modeling efforts (e.g., power-flow abstractions) and integrate outputs into ML workflows
Train and deploy time-series forecasting models for:
Load
Renewable generation (wind, solar)
Grid conditions and system stress indicators
Work with multi-horizon forecasting (short-term operational + long-term planning)
Implement models ranging from:
Classical statistical methods (when appropriate)
Modern ML approaches (deep learning, sequence models, hybrid physics-ML models)
Evaluate models rigorously using real-world performance metrics, not just offline benchmarks
Design end-to-end ML pipelines:
Data ingestion → feature generation → training → validation → deployment → monitoring → retraining
Build reliable inference pipelines that support near-real-time and batch workflows
Implement:
Model versioning
Automated retraining
Drift detection
Performance monitoring
Work closely with product and platform engineers to integrate ML outputs into customer-facing systems

Qualifications

Data engineering, Machine learning, Time-series forecasting, Python, MLOps, Geospatial data, Data pipelines, Systems thinking, Feature engineering, Model versioning

Required

7+ years of experience in data engineering, ML engineering, or applied ML roles
Proven experience deploying ML systems into production (not just notebooks)
Strong background in time-series data (forecasting, anomaly detection, temporal feature engineering)
Deep proficiency in Python and modern data/ML libraries
Experience building scalable data pipelines (batch and streaming)
Strong systems thinking — ability to reason about end-to-end data and model lifecycles

Preferred

Experience with Databricks, Spark, or similar large-scale data platforms
Geospatial data experience (GIS, raster/vector data, spatial joins, map-based features)
Experience in weather, energy, load forecasting, or infrastructure modeling
Familiarity with MLOps frameworks and best practices
Experience working with messy, real-world datasets and ambiguous problem statements
Exposure to hybrid physics + ML systems or domain-constrained modeling

Benefits

Competitive salary (commensurate with seniority and experience)
Meaningful equity in a deeply technical, mission-driven company
High ownership, low bureaucracy, real technical impact

Company

AI20 Labs

Your personal AI team, on-call, on-vibe, always shipping. Unlimited agents development.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase