CloudIngest · 3 days ago
Azure Data Factory Engineer (W2 Only)
CloudIngest is seeking a highly skilled Data Engineer with expertise in modern data platforms and advanced analytics. The ideal candidate will design, build, and optimize data pipelines, ensuring scalable and reliable data solutions across cloud and big data ecosystems.
Responsibilities
Design and implement robust data ingestion pipelines using Azure Data Factory (ADF)
Apply ADF best practices for scalability, monitoring, and error handling
Develop and optimize PySpark jobs, focusing on efficient joins, junctions, and skew mitigation
Identify and resolve data skew issues to improve performance in distributed environments
Implement schema drift handling strategies to ensure pipeline resilience
Build and maintain star schema models for analytical workloads
Work with Azure SQL and Snowflake to design scalable data warehouses
Manage data cleaning and transformation processes for high-quality datasets
Leverage TensorFlow and contextual/model embeddings to integrate machine learning models into data pipelines
Implement bias detection and transformation techniques to ensure fairness in data-driven models
Identify data clusters in Databricks without impacting performance, enabling advanced analytics
Partner with data scientists, analysts, and business stakeholders to deliver actionable insights
Ensure compliance with data governance, security, and privacy standards
Document processes, pipelines, and best practices for knowledge sharing
Qualifications
Required
Proven experience with Azure Data Factory (ADF), including ingestion pipelines and best practices
Strong proficiency in Azure SQL, PySpark, and Databricks
Expertise in Snowflake data warehousing and star schema design
Hands-on experience with TensorFlow for embedding and model integration
Deep understanding of data skew issues and optimization strategies
Knowledge of schema drift handling and bias transformation techniques
Strong background in data cleaning, contextual embedding, model embedding, and advanced analytics integration
Bachelor's or Master's degree in Computer Science, Data Engineering, or related field
5+ years of experience in data engineering roles with cloud and big data platforms
Demonstrated ability to optimize large-scale distributed data systems
Excellent problem-solving, collaboration, and communication skills