Ginkgo Bioworks, Inc. · 1 day ago

Senior Software Engineer, Data Pipelines

Boston, Massachusetts

Full-time

Onsite

Senior Level

$134K/yr - $190K/yr

7+ years exp

Ginkgo Bioworks is dedicated to making biology easier to engineer, focusing on biosecurity infrastructure to address biological threats. The role involves building and operating critical biosecurity data systems, designing reliable data pipelines, and ensuring data quality across various programs.

Biopharma · Biotechnology · Chemical

Growth Opportunities

H1B Sponsor Likely

Responsibilities

Plan, architect, test, and deploy data warehouses, data marts, and ETL/ELT pipelines primarily within AWS and Snowflake environments

Build scalable data pipelines capable of handling structured, unstructured, and high-throughput biological data from diverse sources

Develop data models using dbt with rigorous testing, documentation, and stakeholder-aligned semantics to ensure analytics-ready datasets

Ensure data integrity, consistency, and accessibility across internal and external biosecurity data products

Develop, document, and enforce coding and data modeling standards to improve code quality, maintainability, and system performance

Serve as the in-house data expert, making recommendations on data architecture, pipeline improvements, and best practices; define and adapt data engineering processes to deliver reliable answers to critical biosecurity questions

Build high-performance APIs and microservices in Python that enable seamless integration between the biosecurity data platform and user-facing applications

Design backend services that support real-time and batch data access for biosecurity operations

Create data products that empower public health officials, analysts, and partners with actionable biosecurity intelligence

Democratize access to complex biosecurity datasets using AI and LLMs, making data more discoverable and usable for stakeholders

Apply AI-assisted development tools to accelerate code generation, data modeling, and pipeline development while maintaining high quality standards

Build robust, production-ready data workflows using AWS, Kubernetes, Docker, Airflow, and infrastructure-as-code (Terraform/CloudFormation)

Diagnose system bottlenecks, optimize for cost and speed, and ensure the reliability and fault tolerance of mission-critical data pipelines

Implement observability, monitoring, and alerting to maintain high availability for biosecurity operations

Lead data projects from scoping through execution, including design, documentation, and stakeholder communication

Collaborate with technical leads, product managers, scientists, and data analysts to build robust data products and analytics capabilities

Qualifications

Data Engineering · SQL · Python · ETL/ELT Pipelines · Cloud Data Warehousing · dbt · AWS · Airflow · Software Engineering Fundamentals · Data Quality · APIs · Kubernetes · Docker · Data Analysis · Communication Skills · Observability

Required

7+ years of professional experience in data or software engineering, with a focus on building production-grade data products and scalable architectures

Expert proficiency with SQL for complex transformations, performance tuning, and query optimization

Strong Python skills for data engineering workflows, including pipeline development, ETL/ELT processes, and data processing; experience with backend frameworks (FastAPI, Flask) for API development; focus on writing modular, testable, and reusable code

Proven experience with dbt for data modeling and transformation, including testing frameworks and documentation practices

Hands-on experience with cloud data warehouses (Snowflake, BigQuery, or Redshift), including performance tuning, security hardening, and managing complex schemas

Experience with workflow orchestration tools (Airflow, Dagster, or equivalent) for production data pipelines, including DAG development, scheduling, monitoring, and troubleshooting

Solid grounding in software engineering fundamentals: system design, version control (Git), CI/CD pipelines, containerization (Docker), and infrastructure-as-code (Terraform, CloudFormation)

Hands-on experience managing AWS resources, including S3, IAM roles/policies, API integrations, and security configurations

Strong ability to analyze large datasets, identify data quality issues, debug pipeline failures, and propose scalable solutions

Excellent communication skills and ability to work cross-functionally with scientists, analysts, and product teams to turn ambiguous requirements into maintainable data products

Preferred

Domain familiarity with biological data (PCR, sequencing, wastewater surveillance, TAT metrics) and experience working with lab, bioinformatics, NGS, or epidemiology teams

Production ownership of Snowflake environments including RBAC, secure authentication patterns, and cost/performance optimization

Experience with observability and monitoring stacks (Grafana, Datadog, or similar) and data quality monitoring (anomaly detection, volume/velocity checks, schema drift detection)

Familiarity with container orchestration platforms (Kubernetes) for managing production workloads

Experience with data ingestion frameworks (Airbyte, Fivetran) or building custom ingestion solutions for external partner data delivery

Familiarity with data cataloging, governance practices, and reference data management to prevent silent data drift

Experience designing datasets for visualization tools (Tableau, Looker, Metabase) with a strong understanding of dashboard consumption patterns; familiarity with JavaScript for custom visualizations or front-end dashboard development

Comfort with AI-assisted development tools (GitHub Copilot, Cursor) to accelerate code generation while maintaining quality standards

Startup or fast-paced environment experience with evolving priorities and rapid iteration

Scientific or data-intensive domain experience (life sciences, healthcare, materials science)

Benefits

Company stock awards

Comprehensive benefits package including medical, dental & vision coverage

Health spending accounts

Voluntary benefits

Leave of absence policies

401(k) program with employer contribution

8 paid holidays in addition to a full-week winter shutdown

Unlimited Paid Time Off policy

Company

Ginkgo Bioworks, Inc.

Glassdoor rating: 4.1

At Ginkgo, we use biology to grow the future.

Founded in 2008

Boston, Massachusetts, USA

501-1000 employees

http://ginkgobioworks.com

H1B Sponsorship

Ginkgo Bioworks, Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship (chart omitted)

Trends of Total Sponsorships

2025 (13)

2024 (38)

2023 (25)

2022 (27)

2021 (27)

2020 (8)

Funding

Current Stage

Public Company

Total Funding

$1.58B

Key Investors

Bill & Melinda Gates Foundation · Centers for Disease Control and Prevention · Agriculture and Food Research Initiative

2024-04-10 · Grant

2023-12-13 · Grant

2023-10-05 · Grant

Leadership Team

Reshma Shetty

Co-founder

Austin Che

Founder

Recent News

Research & Development World

OpenAI’s GPT-5 autonomously ran 36,000 protein synthesis experiments in Ginkgo Bioworks’ cloud lab

2026-02-06

Investing.com

Ginkgo Bioworks stock rises after AI lab collaboration with OpenAI shows 40% cost reduction

2026-02-06

EIN Presswire

Feed Enzymes Market to Hit USD 2.38 Billion by 2034, Growing at 5.45% CAGR (2026–2034)

2026-02-05

Company data provided by Crunchbase