Company

Ambry Genetics · 18 hours ago

Principal Data Engineer - Remote, USA

United States

Full-time

Remote

Senior Level

$180K/yr - $200K/yr

6+ years exp

Health Care · Health Diagnostics

H1B Sponsor Likely

Hiring Manager

Noah Kaufman

Responsibilities

Build Kafka connectors to sync updates from source data stores

Build partitioned Kafka topics to sync updates to destination data marts

Build multiplexed data analytics workloads using Apache Flink to monitor streaming metrics and perform real-time data transformations

Build dashboards using Datadog and CloudWatch to monitor system health and support users

Build opinionated but accommodating schema registries that ensure data governance

Work closely with your West Coast-based scrum team to submit and review PRs daily, maintain documentation and backlogs, validate builds across multiple environments, and deploy at a 2–4-week sprint cadence

Design reasonable database schemas with query access patterns as the forethought

Build and maintain CI/CD pipelines using infrastructure-as-code

Iteratively migrate on-prem ETL jobs written in PHP into AWS Flink and Glue processes

Partner with QA Engineers in building automated test suites

Partner with end-users to resolve service disruptions and evangelize our data product offerings

Vigilantly oversee data quality and alert upstream data producers to all disparities, latency, and defects

Develop and maintain the overall data platform architecture strategy, roadmap, and implementation plans to support the company's data-driven initiatives and business objectives.

Design and implement scalable, secure, and high-performance data architectures, including data warehouses, data lakes, and data pipelines, leveraging both on-premises and cloud technologies.

Establish data governance policies, standards, and best practices for data management, data quality, data security, and data privacy across the organization.

Lead the development and implementation of real-time data streaming solutions, including event-driven architectures, data ingestion, transformation, and consumption using technologies like Apache Kafka, Apache Flink, and Amazon Managed Streaming for Apache Kafka (MSK).

Oversee the creation and maintenance of Business Intelligence (BI) platforms, data visualization tools, and self-service analytics capabilities to enable data-driven decision-making across the organization.

Lead and manage a team of data engineers, database administrators, and data analysts, fostering their professional growth, promoting best practices, and ensuring adherence to organizational standards and processes.

Other duties as assigned

Qualifications

Apache Kafka, AWS Kinesis, Apache Flink, Data Governance, Data Warehouses, Python, AWS Glue, Docker, Git, Real-time Data Streams, Data APIs, PHP, MySQL, Terraform, AWS Lambda, Atlassian Products, System Diagramming Tools, Scrum, HANA/4, Redis, JavaScript, PHP MVC Frameworks, Data Visualization Tools, Cloud Technologies, Genomic Concepts, AWS Associate Certification, AWS Data Engineer Certification

Required

Basic understanding of genomic concepts and terminology

Experience with PyFlink

Experience with AWS Kinesis

Willing to work Pacific Time hours, either 8:00 AM – 5:00 PM or 9:00 AM – 6:00 PM

Strong familiarity with any combination of our tech stacks, listed in order of importance: Apache Kafka (MSK flavor preferred), Debezium, Python, Apache Flink or PySpark Streaming, MySQL (RDS flavors preferred), CDK or Terraform, Athena, Glue, Lambda, AppFlow, HANA/4, PHP, Redis, Docker, JavaScript

Experience building data APIs and offering Data as a Service

Experience integrating with SaaS platforms such as SAP and Salesforce

Experience with, or willingness to learn, PHP MVC frameworks such as Symfony

Experience with Atlassian products, e.g. Jira, Confluence, Bamboo

Experience with system diagramming tools such as Miro, Lucidchart, or Visio

6+ years’ experience working with professional scrum teams and/or equivalent schooling

4+ years’ experience using Git version control

3+ years’ experience designing and indexing relational databases

2+ years’ experience building and operationalizing real-time data streams

Bachelor’s or master’s degree in computer science, data science, mathematics, or life sciences, or equivalent work experience

Preferred

AWS Certified Solutions Architect - Associate certification

AWS Certified Data Engineer - Associate certification

Benefits

Medical

Dental

Vision

401k with a 4% employer match

FSA

Paid sick leave

Generous paid time off (PTO) program

Company

Ambry Genetics

Glassdoor

3.1

Ambry leads in clinical genetic diagnostics and genetics software solutions.

Founded in 1999

Aliso Viejo, California, USA

501-1,000 employees

http://www.ambrygen.com/

H1B Sponsorship

Ambry Genetics has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data Powered by US Department of Labor)

Trends of Total Sponsorships

2023 (2)

2022 (3)

2021 (6)

2020 (8)