SRM Digital LLC · 20 hours ago
Lead Data Engineer (AWS)
SRM Digital LLC is seeking a Lead Data Engineer with expertise in AWS cloud data engineering. The role involves designing and maintaining scalable data pipelines, implementing layered data architectures, and collaborating with various teams to ensure data quality and governance.
Responsibilities
Design, build, and maintain scalable data pipelines ingesting data from APIs, files/SFTP, and relational sources
Implement layered data architectures (raw, clean, serving) using PySpark, SQL, dbt, and Python
Orchestrate workflows using Prefect or Airflow, including scheduling, retries, SLAs, parameterization, and operational runbooks
Develop and operate cloud-native data platforms leveraging object storage (S3/ADLS/GCS) and Spark-based compute (Databricks or equivalent)
Manage job configurations, secrets, access control, and environment-specific deployments
Publish and manage governed data services using Azure API Management (APIM) with authentication, authorization, versioning, quotas, and monitoring
Enforce data quality and governance through data contracts, automated validations, lineage, observability, and proactive alerting
Optimize performance and cost via partitioning, clustering, query tuning, workload management, and right-sizing compute
Ensure security and compliance standards are met, including PII handling, encryption, masking, and access controls
Collaborate with analytics, AI/ML engineering, and business teams to deliver production-ready, trusted datasets
Enable AI and LLM use cases by packaging datasets and metadata for downstream consumption, integrating with Model Context Protocol (MCP) where applicable
Continuously improve platform reliability and developer productivity through automation, technical debt reduction, and strong documentation practices
Qualifications
Required
12+ years of professional experience in data engineering
Strong hands-on expertise in Python, SQL, and Spark (PySpark)
Experience with Snowflake (Snowpipe, Streams, Tasks) as a complementary data warehouse
Hands-on experience with Databricks (Delta Lake, workflows, catalogs) or equivalent Spark platforms
Proven experience building ETL/ELT pipelines using Prefect/Airflow, dbt, Spark, and/or Kafka
Experience onboarding datasets to cloud data platforms, including storage, compute, security, and governance
Familiarity with cloud data services across AWS, Azure, or GCP (e.g., S3/ADLS/GCS, Redshift/BigQuery, Glue/ADF)
Experience with Git-based CI/CD pipelines and containerization using Docker (Kubernetes is a plus)
Preferred
Kafka experience is a plus
Kubernetes is a plus
Company
SRM Digital LLC
We at SRM Digital focus on connecting businesses with top talent across various industries.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase