The College Board · 1 day ago
AI/ML Data Engineer
The College Board is a mission-driven, not-for-profit organization dedicated to excellence in education. As an AI/ML Data Engineer, you will design, build, and operate data and ML systems to enhance personalized student experiences and collaborate with cross-functional teams to turn raw data into impactful features for students and higher education partners.
Responsibilities
Design, build, and own batch and streaming ETL (e.g., Kinesis/Kafka → Spark/Glue → Step Functions/Airflow) for training, evaluation, and inference use cases
Stand up and maintain offline/online feature stores and embedding pipelines (e.g., S3/Parquet/Iceberg + vector index) with reproducible backfills
Implement data contracts & validation (e.g., Great Expectations/Deequ), schema evolution, and metadata/lineage capture (e.g., OpenLineage/DataHub/Amundsen)
Optimize lakehouse/warehouse layouts and partitioning (e.g., Redshift/Athena/Iceberg) for scalable ML and analytics
Productionize training and evaluation datasets with versioning (e.g., DVC/LakeFS) and experiment tracking (e.g., MLflow)
Build RAG foundations: document ingestion, chunking, embeddings, retrieval indexing, and quality evaluation (precision@k, faithfulness, latency, and cost)
Collaborate with DS to ship models to serving (e.g., SageMaker/EKS/ECS), automate feature backfills, and capture inference data for continuous improvement
Define SLOs and instrument observability across data and model services (freshness, drift/skew, lineage, cost, and performance)
Embed security & privacy by design (PII minimization/redaction, secrets management, access controls), aligning with College Board standards and FERPA
Build CI/CD for data and models with automated testing, quality gates, and safe rollouts (shadow/canary)
Maintain docs‑as‑code for pipelines, contracts, and runbooks; create internal guides and tech talks
Mentor peers through design reviews, pair/mob sessions, and post‑incident learning
Qualifications
Required
4+ years in data engineering (or 3+ with substantial ML productionization), with strong Python and distributed compute (Spark/Glue/Dask) skills
Proven experience shipping ML data systems (training/eval datasets, feature or embedding pipelines, artifact/version management, experiment tracking)
MLOps/LLMOps: orchestration (Airflow/Step Functions), containerization (Docker), and deployment (SageMaker/EKS/ECS); CI/CD for data & models
Expert SQL and data modeling for lakehouse/warehouse (Redshift/Athena/Iceberg), with performance tuning for large datasets
Data quality & contracts (Great Expectations/Deequ), lineage/metadata (OpenLineage/DataHub/Amundsen), and drift/skew monitoring
Cloud experience, preferably with AWS services such as S3, Glue, Lambda, Athena, Bedrock, OpenSearch, API Gateway, DynamoDB, SageMaker, Step Functions, Redshift, and Kinesis
Experience with BI tools like Tableau, QuickSight, or Looker for real-time analytics and dashboards
Security and privacy mindset; ability to design compliant pipelines handling sensitive student data
An ability to judiciously evaluate the feasibility, fairness, and effectiveness of AI solutions and articulate considerations and concerns around implementing models in the context of specific business applications
Excellent communication, collaboration, and documentation habits
Preferred
RAG & vector search experience (OpenSearch k-NN/pgvector/FAISS) and prompt/eval frameworks
Real‑time feature engineering (Kinesis/Kafka) and low‑latency stores for online inference
Testing strategies for ML systems (unit/contract tests, data fuzzing, offline/online parity checks)
Experience in higher‑ed/assessments data domains
Benefits
Annual bonuses and opportunities for merit-based raises and promotions
A mission-driven workplace where your impact matters
A team that invests in your development and success
Company
The College Board
College Board is a not-for-profit organization that clears a path for all students to own their future through the Advanced Placement Program, the SAT, Official SAT Practice on Khan Academy, BigFuture, and more.
H1B Sponsorship
The College Board has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (7)
2024 (9)
2023 (8)
2022 (12)
2021 (5)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase