Galent · 5 hours ago
Data Engineer
Galent is seeking a Data Engineer to build and maintain the pipelines and datasets required to operationalize Visa’s machine learning models. The role involves ingestion, transformation, and preparation of data, as well as collaboration with data architects and scientists to ensure accurate feature engineering pipelines.
Responsibilities
Build and maintain the pipelines and datasets required to operationalize Visa’s machine learning models in the customer environment
Ingest, transform, and prepare feature-ready data in Databricks, and handle cross-platform data movement from Snowflake
Build scalable PySpark/Spark SQL pipelines in Databricks to prepare curated, feature-ready datasets for model development and scoring
Develop and manage data ingestion and harmonization pipelines
Implement transformations, validations, and schema alignment based on model requirements
Optimize pipeline performance, scheduling, and orchestration (Jobs, Workflows, Delta Live Tables)
Collaborate with the Solution Architect, data architects, and data scientists to ensure accurate feature engineering pipelines
Integrate and harmonize data from Snowflake into Databricks, ensuring schema alignment and data consistency
Prepare curated, feature-ready datasets that meet Visa’s feature definitions and modelling requirements
Monitor pipeline health, data freshness, and quality, escalating issues early
Follow best practices for versioning, CI/CD, and operational MLOps workflows
Work closely with Data Scientists to translate feature logic into efficient, reusable pipelines
Implement orchestration and scheduling using Databricks Jobs/Workflows or Delta Live Tables
Monitor, optimize, and troubleshoot pipelines for performance, data quality, and SLAs
Ensure secure access patterns and compliance with customer data governance requirements
Support Dev/QA/Prod deployment of feature pipelines
Qualifications
Required
4–7 years of hands-on experience in data engineering with cloud data platforms
Strong experience with Databricks, PySpark, Spark SQL, and Delta Lake
Proficiency with Snowflake and cross-platform data integration
Experience building large-scale feature engineering pipelines for ML workloads
Strong SQL skills and experience with distributed data processing
Understanding of MLOps concepts, pipeline versioning, and CI/CD best practices
Strong understanding of the Azure cloud platform, IAM, and enterprise security practices
Company
Galent
Galent is an AI-native digital engineering firm at the forefront of the AI revolution, dedicated to delivering unified, enterprise-ready AI solutions that transform businesses and industries.
Funding
Current Stage
Late Stage
Company data provided by Crunchbase