AI/ML Data Engineer jobs in United States

ExecutivePlacements.com · 2 hours ago

AI/ML Data Engineer

ExecutivePlacements.com is working with College Board, a nonprofit organization focused on higher education. The AI/ML Data Engineer will design and build data and ML systems that enhance personalized student experiences, collaborating with various teams to ensure the delivery of impactful data products.

Human Resources, Online Portals, Recruiting
H1B Sponsored

Responsibilities

Design, build, and own batch and streaming ETL (e.g., Kinesis/Kafka → Spark/Glue → Step Functions/Airflow) for training, evaluation, and inference use cases
Stand up and maintain offline/online feature stores and embedding pipelines (e.g., S3/Parquet/Iceberg + vector index) with reproducible backfills
Implement data contracts & validation (e.g., Great Expectations/Deequ), schema evolution, and metadata/lineage capture (e.g., OpenLineage/DataHub/Amundsen)
Optimize lakehouse/warehouse layouts and partitioning (e.g., Redshift/Athena/Iceberg) for scalable ML and analytics
Productionize training and evaluation datasets with versioning (e.g., DVC/LakeFS) and experiment tracking (e.g., MLflow)
Collaborate with DS to ship models to serving (e.g., SageMaker/EKS/ECS), automate feature backfills, and capture inference data for continuous improvement
Define SLOs and instrument observability across data and model services (freshness, drift/skew, lineage, cost, and performance)
Embed security & privacy by design (PII minimization/redaction, secrets management, access controls), aligning with College Board standards and FERPA
Build CI/CD for data and models with automated testing, quality gates, and safe rollouts (shadow/canary)
Maintain docs-as-code for pipelines, contracts, and runbooks; create internal guides and tech talks
Mentor peers through design reviews, pair/mob sessions, and post-incident learning

Qualifications

Data engineering, Machine Learning, Python, ETL pipelines, SQL, AWS services, MLOps, Data quality, Collaboration, Communication, Documentation

Required

4+ years in data engineering (or 3+ with substantial ML productionization), with strong Python and distributed compute (Spark/Glue/Dask) skills
Proven experience shipping ML data systems (training/eval datasets, feature or embedding pipelines, artifact/version management, experiment tracking)
MLOps/LLMOps: orchestration (Airflow/Step Functions), containerization (Docker), and deployment (SageMaker/EKS/ECS); CI/CD for data & models
Expert SQL and data modeling for lakehouse/warehouse (Redshift/Athena/Iceberg), with performance tuning for large datasets
Data quality & contracts (Great Expectations/Deequ), lineage/metadata (OpenLineage/DataHub/Amundsen), and drift/skew monitoring
Cloud experience, preferably with AWS services such as S3, Glue, Lambda, Athena, Bedrock, OpenSearch, API Gateway, DynamoDB, SageMaker, Step Functions, Redshift, and Kinesis; BI tools like Tableau, QuickSight, or Looker for real-time analytics and dashboards
Security and privacy mindset; ability to design compliant pipelines handling sensitive student data
An ability to judiciously evaluate the feasibility, fairness, and effectiveness of AI solutions and articulate considerations and concerns around implementing models in the context of specific business applications
Excellent communication, collaboration, and documentation habits

Preferred

RAG & vector search experience (OpenSearch KNN/pgvector/FAISS) and prompt/eval frameworks
Real-time feature engineering (Kinesis/Kafka) and low-latency stores for online inference
Testing strategies for ML systems (unit/contract tests, data fuzzing, offline/online parity checks)
Experience in higher-ed/assessments data domains
A passion for expanding educational and career opportunities and mission-driven work
Authorization to work in the United States for any employer
Curiosity and enthusiasm for emerging technologies, with a willingness to experiment with and adopt new AI-driven solutions and a comfort learning and applying new digital tools independently and proactively
Clear and concise communication skills, written and verbal
A learner's mindset and a commitment to growth: welcoming diverse perspectives, giving and receiving timely, respectful feedback, and continuously improving through iterative learning and user input
A drive for impact and excellence: solving complex problems, making data-informed decisions, prioritizing what matters most, and continuously improving through learning, user input, and external benchmarking
A collaborative and empathetic approach: working across differences, fostering trust, and contributing to a culture of shared success

Benefits

A meaningful career
A supportive team
A comprehensive package designed to help you thrive
Fair and competitive compensation

Company

ExecutivePlacements.com

Online recruitment

Funding

Current Stage
Early Stage
Company data provided by crunchbase