PySpark & Delta Lake Developer jobs in United States

i4DM · 8 hours ago

PySpark & Delta Lake Developer

i4DM is an organization that provides federal agencies with access to experienced professionals who address unique challenges with tailored technologies. They are seeking a PySpark & Delta Lake Developer responsible for designing, building, and maintaining scalable ETL pipelines to process and analyze large-scale healthcare claims data.

Computer, Information Technology
H1B Sponsor Likely

Responsibilities

Design, develop, and maintain robust ETL pipelines using PySpark and Delta Lake for large and complex healthcare data workloads
Implement and optimize data lake solutions using Delta Lake table formats, supporting ACID transactions, schema enforcement, and time travel
Write efficient, reusable, and well-documented PySpark scripts for data ingestion, transformation, cleansing, and aggregation
Collaborate with data engineers, architects, and data scientists to understand business and data requirements and translate them into scalable data solutions
Ensure data quality, consistency, lineage, and integrity across all stages of data processing
Troubleshoot, debug, and optimize PySpark applications and Delta Lake workflows for cost, speed, and reliability within AWS
Maintain detailed and up-to-date technical documentation of code, data pipelines, and standard operating procedures
Stay updated with the latest Delta Lake and Spark advancements, advocating for best practices in data management and analytics

Qualifications

PySpark, Delta Lake, AWS cloud services, SQL, CI/CD, Data governance, Healthcare data experience, Communication skills

Required

Strong proficiency in Python and PySpark, with hands-on experience developing data pipelines
Advanced experience with Delta Lake and its ACID transaction and schema management features
Solid SQL skills for querying, joining, and optimizing data in distributed environments
Hands-on experience with AWS cloud data services (e.g., S3, Glue, EMR, Athena)
Familiarity with data lake concepts, partitioning, and performance tuning
Excellent communication skills and a desire to continuously learn and adapt to innovative technologies
Familiarity with CI/CD, version control (e.g., Git), and infrastructure as code

Preferred

Experience with healthcare or claims data
Knowledge of data governance, security, data cataloging (AWS Glue Catalog), and compliance best practices
Strong ability to prioritize and execute tasks independently and within collaborative team environments
Previous experience working in a government or public sector setting

Company

i4DM

i4DM provides a full range of information technology consulting services to government and commercial clients.

H1B Sponsorship

i4DM has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart; the highlighted segment represents job fields similar to this job)
Trends of Total Sponsorships: 2022 (1), 2021 (1)

Funding

Current Stage: Growth Stage
Company data provided by Crunchbase