Incedo Inc.
Data Team Lead
Incedo Inc. is seeking a Databricks Data Lead to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse architecture. The role involves close onsite collaboration with client stakeholders to translate analytical and operational requirements into robust data architectures while ensuring adherence to best practices for data modeling and governance.
Responsibilities
Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL
Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers
Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning
Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks
Collaborate onsite with client data engineers, analysts, and business stakeholders to gather technical requirements, review architecture designs, and validate implementation approaches
Maintain technical documentation covering data flows, transformation logic, table designs, and architectural decisions
Contribute to code reviews, CI/CD practices, and version control workflows to ensure maintainable and production-grade solutions
Qualifications
Required
8-12 years of experience in data engineering, analytics engineering, or distributed data systems
Strong hands-on experience with Databricks Lakehouse Platform
Deep working knowledge of Apache Spark internals, including Spark SQL, DataFrames/Datasets, shuffle behavior, and execution plans
Advanced Python (PySpark) and SQL development skills
Solid understanding of data warehousing concepts, including star and snowflake schemas, fact/dimension modeling, and analytical vs. operational workloads
Experience working with cloud data platforms on AWS, Azure, or GCP
Practical experience with Delta Lake, including merge/upsert patterns, schema enforcement and evolution, and data compaction and optimization
Proficiency with Git-based version control and collaborative development workflows
Strong verbal and written communication skills for client-facing technical discussions
Ability and willingness to work onsite 3 days/week in San Rafael, CA
Preferred
Exposure to Databricks Unity Catalog, data governance, and access control models
Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns
Understanding of data quality frameworks, validation checks, and observability concepts
Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
Awareness of cost optimization strategies in cloud-based data platforms
Prior experience in the life sciences domain
Company
Incedo Inc.
Incedo is a digital, data analytics, and technology services firm headquartered in New Jersey.
H1B Sponsorship
Incedo Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The information below is provided for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships: 2020 (67), 2021 (54), 2022 (114), 2023 (97), 2024 (145), 2025 (108)
Funding
Current Stage: Late Stage
Company data provided by crunchbase