eTeam · 8 hours ago
Senior Data Engineer (PySpark / AWS Big Data)
eTeam is seeking a Senior Data Engineer with expertise in PySpark and AWS Big Data. The role involves designing and building robust ETL/ELT pipelines, optimizing data processing, and ensuring data quality and consistency across ingestion flows.
Information Technology
Responsibilities
Design and build robust, scalable ETL/ELT pipelines using PySpark to ingest data from diverse sources (databases, logs, APIs, files)
Transform and curate raw transactional and log data into analysis-ready datasets in the Data Hub and analytical data marts
Develop reusable and parameterized Spark jobs for batch and micro-batch processing
Optimize performance and scalability of PySpark jobs across large data volumes
Ensure data quality, consistency, lineage, and proper documentation across ingestion flows
Collaborate with Data Architects, Modelers, and Data Scientists to implement ingestion logic aligned with business needs
Work with cloud-based data platforms (e.g., AWS S3, Glue, EMR, Redshift) for data movement and storage
Support version control, CI/CD, and infrastructure-as-code where applicable
Participate in Agile ceremonies and contribute to sprint planning, story grooming, and demos
Qualifications
Required
4+ years of experience in data engineering, with strong focus on PySpark/Spark for big data processing
Expertise in building data pipelines and ingestion frameworks from relational, semi-structured (JSON, XML), and unstructured sources (logs, PDFs)
Proficiency in Python with strong knowledge of data processing libraries
Strong SQL skills for querying and validating data in platforms like Amazon Redshift, PostgreSQL, or similar
Experience with distributed computing frameworks (e.g., Spark on EMR, Databricks)
Familiarity with workflow orchestration tools (e.g., AWS Step Functions)
Solid understanding of data lake / data warehouse architectures and data modeling basics
Minimum Years of Experience: 8-15+
Certifications Needed: Yes (AWS)
Preferred
Experience with AWS data services: Glue, S3, Redshift, Lambda, CloudWatch, etc.
Familiarity with Delta Lake or similar for large-scale data storage
Exposure to real-time streaming frameworks (e.g., Spark Structured Streaming, Kafka)
Knowledge of data governance, lineage, and cataloging tools (e.g., AWS Glue Catalog, Apache Atlas)
Understanding of DevOps/CI-CD pipelines for data projects using Git, Jenkins, or similar tools
Company
eTeam
eTeam is a staffing agency that also provides payrolling services.
H1B Sponsorship
eTeam has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (36)
2024 (205)
2023 (11)
2022 (7)
2021 (24)
2020 (25)
Funding
Current Stage: Late Stage
Total Funding: unknown
2023-12-04: Acquired
Company data provided by Crunchbase