Cynet Systems
Lead Data Engineer - Remote / Telecommute
Cynet Systems is seeking a Lead Data Engineer to design and implement data pipelines and manage data workflows. The role involves optimizing data processes and collaborating with cross-functional teams to ensure data quality and governance.
Employment · Recruiting · Staffing Agency
Responsibilities
Design and implement end-to-end data pipelines using Cloud Dataflow (Python/Apache Beam) for batch and streaming data (see the pipeline sketch after this list)
Develop, optimize, and maintain BigQuery stored procedures (SPs), SQL scripts, and user-defined functions (UDFs) for complex transformations and business logic implementation
Build and manage data orchestration workflows using Cloud Composer (Airflow) with appropriate operators and dependencies
Establish secure and efficient connections to source systems for data ingestion and integration
Manage data ingestion workflows from on-premise and cloud sources into Google Cloud Storage (GCS) and BigQuery
Execute historical data migration from legacy data warehouses (preferably Snowflake, Teradata, Netezza, Oracle, SQL Server) to BigQuery, ensuring accuracy and performance optimization
Design and maintain data validation and testing frameworks for ensuring data quality and reliability across pipelines
Implement data governance practices, including metadata management, lineage tracking, and access control
Collaborate with data analysts and architects to define scalable and reusable data models, views, and semantic layers in BigQuery
Troubleshoot data pipeline failures, perform root cause analysis, and implement preventive measures
Optimize cost and performance of GCP workloads using best practices for BigQuery and Dataflow
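For context, the sketch below shows the kind of Beam/Dataflow batch pipeline the first responsibility describes: read newline-delimited JSON from GCS, apply a transformation, and load the results into BigQuery. The project id, bucket paths, table name, and parsing rule are hypothetical placeholders, not the employer's actual setup.

    # Minimal Beam batch pipeline sketch; all resource names are hypothetical.
    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def parse_record(line: str) -> dict:
        """Parse one newline-delimited JSON record from the source file."""
        record = json.loads(line)
        # Hypothetical business rule: normalize the customer id field.
        record["customer_id"] = str(record.get("customer_id", "")).strip()
        return record


    def run() -> None:
        options = PipelineOptions(
            runner="DataflowRunner",             # use DirectRunner for local tests
            project="my-gcp-project",            # hypothetical project id
            region="us-central1",
            temp_location="gs://my-bucket/tmp",  # hypothetical staging bucket
        )
        with beam.Pipeline(options=options) as p:
            (
                p
                | "ReadFromGCS" >> beam.io.ReadFromText("gs://my-bucket/raw/*.json")
                | "ParseJSON" >> beam.Map(parse_record)
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    "my-gcp-project:analytics.customers",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                )
            )


    if __name__ == "__main__":
        run()

A streaming variant of the same shape would swap the GCS read for a Pub/Sub source and enable streaming in the pipeline options.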
Qualifications
Required
Strong hands-on experience in Google Cloud Platform (GCP) with expertise in BigQuery, Dataflow, Cloud Composer, and GCS
Proficient in Python programming for data engineering (Dataflow pipelines, validation scripts, automation)
Expertise in BigQuery SQL, Stored Procedures, Views, and User Defined Functions (UDFs)
Experience with data migration from on-premise or cloud data warehouses (preferably Snowflake, Teradata, Netezza, Oracle, SQL Server) to BigQuery
Strong understanding of ETL/ELT frameworks, data modeling, and schema design (star and snowflake schemas)
Familiarity with data governance, metadata, and lineage frameworks on GCP
Knowledge of data validation and testing techniques, including reconciliation, rule-based checks, and automation
Hands-on experience in workflow orchestration using Cloud Composer (Airflow) with custom operators (see the DAG sketch after this list)
Strong SQL tuning and BigQuery performance optimization skills
Good communication and collaboration skills to work with cross-functional teams in Agile environments
Proven ability to troubleshoot, optimize, and enhance complex data workflows at scale
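For reference, a minimal Composer (Airflow 2.x) DAG sketch along the lines of the orchestration and validation duties above: it calls a BigQuery stored procedure and then runs a rule-based check on the loaded table. The DAG id, schedule, project, dataset, and procedure names are hypothetical, and a standard provider operator stands in here for the custom operators mentioned above.

    # Minimal Composer/Airflow DAG sketch; all resource names are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    with DAG(
        dag_id="daily_customer_load",      # hypothetical DAG id
        schedule_interval="0 6 * * *",     # run daily at 06:00 UTC
        start_date=datetime(2024, 1, 1),
        catchup=False,
    ) as dag:
        # Call a BigQuery stored procedure holding the transformation logic.
        run_transform_sp = BigQueryInsertJobOperator(
            task_id="run_transform_sp",
            configuration={
                "query": {
                    "query": "CALL `my-gcp-project.analytics.sp_load_customers`()",
                    "useLegacySql": False,
                }
            },
        )

        # Rule-based validation: fail the run if the target table came up empty.
        validate_load = BigQueryInsertJobOperator(
            task_id="validate_load",
            configuration={
                "query": {
                    "query": (
                        "SELECT IF(COUNT(*) > 0, 1, ERROR('empty load')) "
                        "FROM `my-gcp-project.analytics.customers`"
                    ),
                    "useLegacySql": False,
                }
            },
        )

        run_transform_sp >> validate_load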