Booz Allen Hamilton · 16 hours ago
Apache Iceberg Data Engineer
Booz Allen Hamilton is a leading consulting firm that focuses on technology solutions for various missions. They are seeking an Apache Iceberg Data Engineer to develop and maintain data pipelines and workflows for large-scale datasets, ensuring efficiency and reliability while supporting mission-driven projects.
Cyber Security · Cloud Computing · Consulting · IT Infrastructure · Management Consulting · Security
Responsibilities
Develop and maintain data pipelines and workflows for large-scale datasets, ensuring efficiency and reliability
Work with Apache Iceberg or comparable table formats such as Delta Lake or Hudi, including data lake transactions, schema evolution, data version control, and partition optimization
Work with distributed file systems, such as S3, HDFS, or GCS, and implement scalable, high-performance data lake infrastructure
Work with query engines such as Presto, Trino, Spark, or Hive, integrating them with Iceberg-backed tables for efficient querying of large datasets
Implement scalable ETL/ELT processes to populate and maintain Iceberg tables using Python and other programming languages such as Java or Scala
Manage data lifecycle including time-travel queries and optimizing data for both historical and real-time use cases
Debug and troubleshoot data lake environments, addressing issues related to data consistency, governance, and performance bottlenecks
Design and document reusable, modular solutions for managing and interacting with Iceberg-backed datasets in complex ecosystems
Qualifications
Required
2+ years of experience developing and maintaining data pipelines and workflows for large-scale datasets, ensuring efficiency and reliability
Experience working with Apache Iceberg or comparable table formats such as Delta Lake or Hudi, including data lake transactions, schema evolution, data version control, and partition optimization
Experience working with distributed file systems, such as S3, HDFS, or GCS, and implementing scalable, high-performance data lake infrastructure
Experience with query engines such as Presto, Trino, Spark, or Hive, integrating them with Iceberg-backed tables for efficient querying of large datasets
Experience in Python and other programming languages such as Java or Scala, including implementing scalable ETL/ELT processes to populate and maintain Iceberg tables
Experience working with data lifecycle management, including time-travel queries and optimizing data for both historical and real-time use cases
Knowledge of data lake and warehouse architecture principles and platforms, including best practices for storage optimization and modern lakehouse paradigms
Ability to debug and troubleshoot data lake environments, addressing issues related to data consistency, governance, and performance bottlenecks, and to design and document reusable, modular solutions for managing and interacting with Iceberg-backed datasets in complex ecosystems
Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
Bachelor's degree in Data Engineering or Computer Science
Preferred
Experience integrating Apache Iceberg with orchestration tools like Apache Airflow to automate workflows involving complex data lake operations
Experience with containerized environments such as Docker, and orchestration platforms such as Kubernetes, ensuring scalability for Iceberg-backed systems
Experience working with AWS Glue Catalog, Hive Metastore, or other metadata or catalog systems to efficiently manage Iceberg schema and table metadata
Experience adapting Iceberg implementations for the cloud
Experience implementing data governance principles, including role-based access and compliance policies, into Iceberg workflows
Knowledge of cloud-native object storage systems such as AWS S3, Azure Data Lake, or Google Cloud Storage
Knowledge of distributed computing systems, such as Spark or Flink, for both batch and real-time data processing involving Iceberg datasets
Knowledge of partitioning strategies and optimization techniques for performance tuning of Iceberg analytics
Knowledge of real-time data streaming and integrating tools such as Kafka with Iceberg for near-real-time ingestion and analytics
Knowledge of Agile engineering practices
Benefits
Health, life, disability, financial, and retirement benefits
Paid leave
Professional development
Tuition assistance
Work-life programs
Dependent care
Recognition awards program
Company
Booz Allen Hamilton
Booz Allen Hamilton is a consulting firm that specializes in analytics, technology, and engineering.
Funding
Current Stage: Public Company
Total Funding: $3.03B
2025-03-11 · Post-IPO Debt · $650M
2023-08-01 · Post-IPO Debt · $650M
2020-08-13 · Post-IPO Debt · $700M
Company data provided by Crunchbase