GCP Data Engineer (W2 Only) @ TechDemocracy | Jobright.ai
84 applicants

TechDemocracy · 7 hours ago

GCP Data Engineer (W2 Only)

Consulting · Cyber Security
H1B Sponsor Likely
Hiring Manager: Vijay Adepu


Responsibilities

Design, develop, and implement scalable, high-performance data solutions on GCP
Curate and manage a comprehensive data set detailing user permissions and group memberships
Redesign the existing data pipeline to improve scalability and reduce processing time
Ensure that changes to data access permissions are reflected in the Tableau dashboard within 24 hours
Collaborate with technical and business users to share and manage data sets across multiple projects
Utilize GCP tools and technologies to optimize data processing and storage
Re-architect the data pipeline that builds the BigQuery dataset used for GCP IAM dashboards to make it more scalable
Run and customize DLP scans
Build bidirectional integrations between GCP and Collibra
Explore and potentially implement Dataplex and custom format-preserving encryption for de-identifying data for developers in lower environments

Qualifications


Google Cloud Platform (GCP), BigQuery, Python, Data Engineering, Google's IAM API, Linux/Unix, HDFS, Spark, SQL, Shell scripting, Airflow, Pub/Sub, Cloud Storage, Dataflow, Dataproc, Composer, GitHub, Unit testing, Jenkins, Terraform, Scala, Kafka, Flume, RESTful APIs, SOAP APIs, Avro, Parquet, JSON, Jira, Confluence

Required

Bachelor's degree in Computer Engineering or a related field
5+ years of experience in an engineering role using Python, Java, Spark, and SQL
5+ years of experience working as a Data Engineer in GCP
Proficiency with Google’s Identity and Access Management (IAM) API
Strong Linux/Unix background and hands-on knowledge
Experience with big data technologies such as HDFS, Spark, Impala, and Hive
Experience with shell scripting and Bash
Experience with version control platforms like GitHub
Experience with unit testing code
Experience with development ecosystems including Jenkins, Artifactory, CI/CD, and Terraform
Demonstrated proficiency with Airflow
Excellent written and verbal communication skills
Ability to understand and analyze complex data sets
Ability to exercise independent judgment on moderately complex issues
Ability to make recommendations to management on new processes, tools, and techniques
Ability to work under minimal supervision and use independent judgment requiring analysis of variable factors
Ability to collaborate with senior professionals in the development of methods, techniques, and analytical approaches
Ability to advise management on approaches to optimize for data platform success
Ability to effectively communicate highly technical information to various audiences, including management, the user community, and less-experienced staff
Proficiency in multiple programming languages, frameworks, domains, and tools
Coding skills in Scala
Experience with GCP platform development tools such as Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer
Knowledge of Hadoop, cloud platforms, and their surrounding ecosystems
Experience with web services and APIs (RESTful and SOAP)
Ability to document designs and concepts
Experience with API orchestration and choreography for consumer apps
Well-rounded technical expertise in Apache packages and hybrid cloud architectures
Experience creating and automating data acquisition pipelines
Experience designing and building metadata extraction pipelines between raw and transformed datasets
Experience collecting quality control metrics on data acquisition pipelines
Ability to collaborate with scrum teams, including the scrum master, product owner, data analysts, quality assurance, business owners, and data architects, to produce the best possible end products
Experience contributing to and leveraging Jira and Confluence
Strong experience working with real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark, Kafka, Flume, Pub/Sub, and Airflow
Ability to work with different file formats like Avro, Parquet, and JSON
Experience managing and scheduling batch jobs
Hands-on experience in Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC)

Company

TechDemocracy

Global IT consulting company

H1B Sponsorship

TechDemocracy has a track record of offering H1B sponsorships, though this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: distribution of job fields receiving sponsorship, highlighting fields similar to this job]
Trends of Total Sponsorships
2023: 14
2022: 34
2021: 91
2020: 136

Funding

Current Stage: Growth Stage

Leadership Team

Ankur Vora, Managing Partner

Company data provided by Crunchbase