Caris Life Sciences · 15 hours ago
Senior Data Engineer
Caris Life Sciences is dedicated to transforming cancer care and improving lives through precision medicine. The Senior Data Engineer will design and maintain scalable data platforms that support analytics and machine learning workflows, collaborating with data scientists and R&D stakeholders.
Biotechnology · Artificial Intelligence (AI) · Healthcare · Pharmaceutical · Life Science · Biopharma · Health Care
Responsibilities
Design, build, and maintain scalable, reliable, and secure data pipelines for ingesting, transforming, storing, and serving large, multi-source and multi-omics datasets
Architect and implement cloud-native data solutions on AWS to support analytics workflows, machine learning pipelines, and scientific research
Develop and maintain automation frameworks for data ingestion, processing, validation, and delivery
Build and deploy APIs, services, and data access layers to enable analytics and machine-learning solutions at scale
Develop and deploy applications and workflows in cloud and/or HPC environments, adhering to industry best practices for system architecture, CI/CD, testing, and software design
Partner closely with data scientists, computational biologists, and R&D scientists to design and evolve shared analytics platforms
Optimize data systems for performance, cost efficiency, scalability, and reliability
Ensure data quality, observability, and lineage across pipelines and platforms
Adhere to coding, documentation, security, and compliance standards; manage technical deliverables for assigned projects
Provide general informatics and platform support for laboratory research, technology development, and clinical studies
Contribute to architectural decisions and mentor junior engineers as appropriate
Qualifications
Required
Ph.D. in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
5+ years of professional experience in data engineering, platform engineering, or backend software engineering roles
Strong proficiency in Python and experience building production-grade data pipelines and services
Extensive experience designing and operating data platforms on AWS, including EC2, S3, DynamoDB, EKS/ECS, Lambda, Glue, Athena, and related services
Experience with Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or CDK
Expertise in designing, implementing, and maintaining relational and non-relational databases (e.g., MySQL, PostgreSQL, MongoDB)
Extensive experience with containerization and orchestration technologies
Strong proficiency with Linux and command-line–based workflows
Familiarity with modern data platform concepts, including data lakes, lakehouses, streaming, and batch processing architectures
Experience applying best practices in DevOps, DataOps, and/or MLOps, including CI/CD, monitoring, and automated testing
Strong communication skills and the ability to collaborate effectively with multidisciplinary scientific and engineering teams
Team-oriented mindset with a passion for building robust platforms that enable data-driven discovery and personalized medicine
Preferred
Familiarity with cancer biology concepts, including tumor genomics and molecular profiling workflows
Experience supporting data pipelines for molecular diagnostics, biomarker discovery, or translational research
Working knowledge of common molecular and clinical data types used in oncology research (e.g., NGS-derived data, variant annotations, expression matrices, clinical metadata)
Experience handling high-throughput sequencing–derived data and associated metadata at scale, including ingestion, normalization, and provenance tracking
Understanding of bioinformatics data standards and formats (e.g., FASTQ, BAM/CRAM, VCF, GTF, or similar structured scientific data representations)
Familiarity with public cancer and genomics datasets (e.g., TCGA, COSMIC, cBioPortal, GEO, or equivalent resources)
Experience collaborating closely with computational biologists, bioinformaticians, and cancer researchers to translate research requirements into scalable data platform solutions
Awareness of data quality, reproducibility, and traceability requirements in regulated or clinically adjacent oncology environments
Company
Caris Life Sciences
Caris Life Sciences develops molecular profiling and AI-driven technologies to support precision medicine in oncology.
Funding
Current Stage
Public Company
Total Funding
$1.86B
Key Investors
Braidwell, OrbiMed, Sixth Street
2025-06-18 · IPO
2025-04-07 · Private Equity · $168M
2023-01-19 · Debt Financing · $400M
Company data provided by Crunchbase