Helix · 3 hours ago
Senior Genomics Data Engineer
Helix is a company dedicated to transforming healthcare through genomics. They are seeking a Senior Genomics Data Engineer to develop innovative solutions for simplifying complex genomic data and optimizing data pipelines for large-scale genomic and clinical data processing.
Big Data, Bioinformatics, Biotechnology, Blockchain, Cryptocurrency, FinTech, Genetics, Health Care, Personal Health
Responsibilities
Leverage your specialized knowledge to develop innovative solutions that simplify complex genomic data, effectively lowering the barrier to entry for non-expert users
Design, build, and continuously optimize robust, scalable, and automated data pipelines for processing large-scale genomic and clinical data
Build and maintain critical pipelines to prepare, de-identify, and securely deliver massive-scale genetic datasets to both internal research teams and external partners
Work cross-functionally with world-class Engineering, Research, AI, Data Science, Bioinformatics, Product, and Commercial teams to tackle complex data challenges and drive scientific discovery
Provide expert-level support and create tooling to help internal and external data consumers effectively utilize our complex datasets and platforms
Implement and manage data infrastructure as code using tools like AWS CDK, ensuring our distributed compute environment is efficient and scalable
Qualifications
Required
Bachelor's/Master's degree in Computer Science, Bioinformatics, Engineering or a related field with 5+ years of experience
Deep domain knowledge in molecular biology, next-generation sequencing, or genomics
Demonstrated experience in processing a variety of large-scale genetic data formats (exome/whole genome), including but not limited to VCF, CRAM, BAM, and PLINK
Strong experience using industry-standard bioinformatics tools such as bcftools, htslib, and samtools
Experience with genomic data-reduction techniques, such as PCA
Expert-level proficiency in Python
Proven experience designing and building distributed systems on AWS, including expertise with services like Glue, EMR, S3, Lambda, and DynamoDB
Proficiency with infrastructure-as-code frameworks (e.g., AWS CDK, Terraform)
Expertise with ETL pipeline automation and workflow management tools such as Airflow, AWS Glue, AWS Step Functions, and CI/CD
Familiarity with database design, data manipulation, and data quality techniques
Demonstrated ability to thrive in a fast-paced, adaptable environment
Preferred
Background in the bioinformatics or healthcare industries, and familiarity with clinical data
Proficiency in Go, Java, C, C++, or Scala
Hands-on skills with genomics-specific data tools such as Hail or TileDB
Track record of working in a regulated data environment (e.g., HIPAA)
Hands-on experience designing and building distributed systems on AWS with frameworks such as Spark, Dask, EMR, Databricks, or similar
Benefits
Comprehensive Health Insurance with Date of Hire eligibility
12 weeks Helix Paid Parental Leave option
401(k) with employer matching of up to 3% and 100% Vesting on the Date of Hire
Comprehensive Well-Being Benefits
Flexible PTO
Remote options for many roles and a home office stipend
Company
Helix
Helix is a population genomics company that is working to advance genomics research and integrate genomic data into clinical care.
H1B Sponsorship
Helix has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (4)
2024 (3)
2023 (3)
2022 (1)
Funding
Current Stage: Growth Stage
Total Funding: $403M
Key Investors: Warburg Pincus, National Institutes of Health, DFJ Growth
2021-06-03 Series C · $50M
2020-07-31 Grant · $33M
2018-03-01 Series B · $200M
Company data provided by Crunchbase