Golden Technology · 4 hours ago
Data Engineer with GCP
Responsibilities
Provide Technical Leadership: Ensure clarity across ongoing projects and facilitate collaboration across teams to solve complex data engineering challenges.
Build and Maintain Data Pipelines: Design, build, and maintain scalable, efficient, and reliable data pipelines to support data ingestion, transformation, and integration across diverse sources and destinations, using tools such as Kafka, Databricks, and similar toolsets.
Drive Digital Innovation: Leverage innovative technologies and approaches to modernize and extend core data assets, including SQL-based, NoSQL-based, cloud-based, and real-time streaming data platforms.
Implement Feature Engineering: Develop and manage feature engineering pipelines for machine learning workflows, utilizing tools like Vertex AI, BigQuery ML, and custom Python libraries.
Implement Automated Testing: Design and implement automated unit, integration, and performance testing frameworks to ensure data quality, reliability, and compliance with organizational standards.
Optimize Data Workflows: Optimize data workflows for performance, cost efficiency, and scalability across large datasets and complex environments.
Mentor Team Members: Mentor team members in data principles, patterns, processes, and practices to improve team capabilities.
Draft and Review Documentation: Draft and review architectural diagrams, interface specifications, and other design documents to ensure clear communication of data solutions and technical requirements.
Cost/Benefit Analysis: Present opportunities with cost/benefit analysis to leadership, guiding sound architectural decisions for scalable and efficient data solutions.
Qualification
Required
4+ years of professional Data Development experience.
4+ years of experience with SQL and NoSQL technologies.
3+ years of experience building and maintaining data pipelines and workflows.
5+ years of experience developing with Java.
2+ years of experience developing with Python.
3+ years of experience developing Kafka solutions.
2+ years of experience in feature engineering for machine learning pipelines.
Experience with GCP services such as BigQuery, Vertex AI Platform, Cloud Storage, AutoMLOps, and Dataflow.
Experience with CI/CD pipelines and processes.
Experience with automated unit, integration, and performance testing.
Experience with version control software such as Git.
Full understanding of ETL and Data Warehousing concepts.
Strong understanding of Agile principles (Scrum).
Preferred
Knowledge of Structured Streaming (Spark, Kafka, EventHub, or similar technologies).
Experience with GitHub SaaS/GitHub Actions.
Understanding of Databricks concepts.
Experience with PySpark and Spark development.