Capgemini · 8 hours ago
Azure Data Engineer - Pyspark
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world. They are seeking an Azure Data Engineer with expertise in PySpark to design and implement data pipelines, optimize performance, and collaborate with teams on CI/CD processes.
Consulting · Information Technology · InsurTech · IT Management · Software
Responsibilities
Design and implement batch and streaming data pipelines on Azure Databricks (PySpark); author Spark SQL for transformations and analytics
Orchestrate workflows with Azure Data Factory and/or Synapse pipelines; integrate with Azure Data Lake Storage
Model and maintain lakehouse structures (e.g., Delta Lake), ensuring robust partitioning, schema evolution, and performance
Implement data quality checks, observability, and SLAs across pipelines
Optimize jobs for cost and performance (cluster sizing, caching, shuffle reduction, partition strategy)
Collaborate with architecture/platform teams on CI/CD (Azure DevOps), secrets management (Key Vault), and security (RBAC, PIM)
Contribute to governance and metadata practices; document lineage and technical design
Support release cycles, incident triage, and production hardening; drive continuous improvements
Qualifications
Required
Hands-on expertise in PySpark (DataFrames, Spark SQL, performance tuning) on Azure Databricks; this is the primary skill focus
Strong SQL and data modeling for analytical workloads (star/snowflake, lakehouse patterns)
Proven delivery with Azure Data Factory/Synapse for pipeline orchestration and scheduling
Solid knowledge of Azure storage (ADLS Gen2, partitions, file formats such as Parquet and Delta)
Version control and CI/CD with Git/Azure DevOps; automated testing in data pipelines
Experience operating pipelines in production (monitoring, alerting, reliability)
Preferred
Microsoft Fabric exposure is nice to have (not mandatory)
Data governance tools (e.g., Purview), Power BI integration, Delta Live Tables
Python packaging best practices; basic PowerShell for automation
Domain experience in financial services/asset management
Benefits
Flexible work
Healthcare including dental, vision, mental health, and well-being programs
Financial well-being programs such as 401(k) and Employee Share Ownership Plan
Paid time off and paid holidays
Paid parental leave
Family building benefits like adoption assistance, surrogacy, and cryopreservation
Social well-being benefits like subsidized back-up child/elder care and tutoring
Mentoring, coaching and learning programs
Employee Resource Groups
Disaster Relief
Paid time off based on employee grade (A-F), defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
Life and disability insurance
Employee assistance programs
Other benefits as provided by local policy and eligibility
Company
Capgemini
Capgemini is a software company that provides consulting, technology, and digital transformation services.
H1B Sponsorship
Capgemini has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (2856)
2024 (3012)
2023 (3424)
2022 (4392)
2021 (3311)
2020 (5871)
Funding
Current Stage
Public Company
Total Funding
$4.72B
2025-09-18 · Post-IPO Debt · $4.72B
1999-04-01 · IPO
Company data provided by Crunchbase