i4DM · 8 hours ago
PySpark & Delta Lake Developer
i4DM is an organization that provides federal agencies with access to experienced professionals who address unique challenges with tailored technologies. They are seeking a PySpark & Delta Lake Developer responsible for designing, building, and maintaining scalable ETL pipelines to process and analyze large-scale healthcare claims data.
Computer, Information Technology
Responsibilities
Design, develop, and maintain robust ETL pipelines using PySpark and Delta Lake for large and complex healthcare data workloads
Implement and optimize data lake solutions using Delta Lake table formats, supporting ACID transactions, schema enforcement, and time travel
Write efficient, reusable, and well-documented PySpark scripts for data ingestion, transformation, cleansing, and aggregation
Collaborate with data engineers, architects, and data scientists to understand business and data requirements and translate them into scalable data solutions
Ensure data quality, consistency, lineage, and integrity across all stages of data processing
Troubleshoot, debug, and optimize PySpark applications and Delta Lake workflows for cost, speed, and reliability within AWS
Maintain detailed and up-to-date technical documentation of code, data pipelines, and standard operating procedures
Stay updated with the latest Delta Lake and Spark advancements, advocating for best practices in data management and analytics
Qualifications
Required
Strong proficiency in Python and PySpark, with hands-on experience developing data pipelines
Advanced experience with Delta Lake and its ACID transaction and schema management features
Solid SQL skills for querying, joining, and optimizing data in distributed environments
Hands-on experience with AWS cloud data services (e.g., S3, Glue, EMR, Athena)
Familiarity with data lake concepts, partitioning, and performance tuning
Excellent communication skills and a desire to continuously learn and adapt to innovative technologies
Familiarity with CI/CD, version control (e.g., Git), and infrastructure as code
Preferred
Experience with healthcare or claims data
Knowledge of data governance, security, data cataloging (AWS Glue Catalog), and compliance best practices
Strong ability to prioritize and execute tasks independently and within collaborative team environments
Previous experience working in a government or public sector setting
Company
i4DM
i4DM provides a full range of information technology consulting services to government and commercial clients.
H1B Sponsorship
i4DM has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2022 (1)
2021 (1)
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase