OKAYA INFOCOM · 6 hours ago

AWS Data Engineering Lead - New York, NY - Full Time

New York, NY

Full-time

Onsite

Senior Level, Lead/Staff

4+ years exp

OKAYA INFOCOM is seeking an AWS Data Engineering Lead to manage data engineering services and oversee cloud data platforms. The role involves designing and operating data pipelines, ensuring data quality and governance, and collaborating with stakeholders to deliver reliable datasets for analytics and AI applications.

Consulting · Information Technology · Software

Hiring Manager

Sanjeev Kumar

Responsibilities

Ingest and model data from APIs, files/SFTP, and relational sources; implement layered architectures (raw/clean/serving) using PySpark/SQL, dbt, and Python

Design and operate pipelines with Prefect (or Airflow), including scheduling, retries, parameterization, SLAs, and well-documented runbooks

Build on cloud data platforms, leveraging S3/ADLS/GCS for storage and a Spark platform (e.g., Databricks or equivalent) for compute; manage jobs, secrets, and access

Publish governed data services and manage their lifecycle with Azure API Management (APIM), including authentication/authorization, policies, versioning, quotas, and monitoring

Enforce data quality and governance through data contracts, validations/tests, lineage, observability, and proactive alerting

Optimize performance and cost via partitioning, clustering, query tuning, job sizing, and workload management

Uphold security and compliance (e.g., PII handling, encryption, masking) in line with firm standards

Collaborate with stakeholders (analytics, AI engineering, and business teams) to translate requirements into reliable, production-ready datasets

Enable AI/LLM use cases by packaging datasets and metadata for downstream consumption, integrating via Model Context Protocol (MCP) where appropriate

Continuously improve platform reliability and developer productivity by automating routine tasks, reducing technical debt, and maintaining clear documentation

Qualifications

AWS Data Engineering Services · Python · Spark · SQL · Snowflake · Databricks · ETL/ELT · Prefect · Azure/AWS/GCP · Stakeholder Communication · Risk Management · Operational Excellence

Required

AWS Data Engineering Services (EMR/Glue, Redshift, Aurora, S3, Lambda), Spark, Python, Collibra, Snowflake/Databricks, Tableau

4–15 years of professional data engineering experience

Strong skills in Python, SQL, and Spark (PySpark), and/or Kafka

Snowflake (Snowpipe, Tasks, Streams) as a complementary warehouse

Databricks (Delta formats, workflows, cataloging) or equivalent Spark platforms

Hands-on experience building ETL/ELT with Prefect (or Airflow), dbt, Spark, and/or Kafka

Experience onboarding datasets to cloud data platforms (storage, compute, security, governance)

Familiarity with Azure/AWS/GCP data services (e.g., S3/ADLS/GCS; Redshift/BigQuery; Glue/ADF)

Git-based workflows, CI/CD, and containerization with Docker (Kubernetes a plus)

Strategic Technical Leadership: Defining data architecture, evaluating new technologies, and setting technical standards for AWS-based pipelines

Stakeholder Communication: Bridging the gap between technical teams and business stakeholders, gathering requirements, and reporting progress

Risk Management: Proactively identifying potential bottlenecks in data workflows, security risks, or scalability issues

Operational Excellence: Implementing automation, optimizing costs, and maintaining high data quality standards

Company

OKAYA INFOCOM

OKAYA was established in 2006 with the mission to enable success for our partners through trust and commitment.

Founded in 2006

Holbrook, New York, USA

201-500 employees

https://www.okayainfo.com

Funding

Current Stage

Growth Stage

Leadership Team

Akshay Gupta

CEO

Company data provided by Crunchbase