Senior Data Engineer jobs in United States

Caris Life Sciences · 7 hours ago

Senior Data Engineer

Caris Life Sciences is dedicated to transforming cancer care and improving lives through precision medicine. The Senior Data Engineer will design and maintain scalable data platforms that support analytics and machine learning workflows, collaborating with data scientists and R&D stakeholders.
Biotechnology, Artificial Intelligence (AI), Healthcare, Pharmaceutical, Life Science, Biopharma, Health Care

Responsibilities

Design, build, and maintain scalable, reliable, and secure data pipelines for ingesting, transforming, storing, and serving large, multi-source and multi-omics datasets
Architect and implement cloud-native data solutions on AWS to support analytics workflows, machine learning pipelines, and scientific research
Develop and maintain automation frameworks for data ingestion, processing, validation, and delivery
Build and deploy APIs, services, and data access layers to enable analytics and machine-learning solutions at scale
Develop and deploy applications and workflows in cloud and/or HPC environments, adhering to industry best practices for system architecture, CI/CD, testing, and software design
Partner closely with data scientists, computational biologists, and R&D scientists to design and evolve shared analytics platforms
Optimize data systems for performance, cost efficiency, scalability, and reliability
Ensure data quality, observability, and lineage across pipelines and platforms
Adhere to coding, documentation, security, and compliance standards; manage technical deliverables for assigned projects
Provide general informatics and platform support for laboratory research, technology development, and clinical studies
Contribute to architectural decisions and mentor junior engineers as appropriate

Qualifications

AWS, Python, Data engineering, Data architecture, Infrastructure as Code, Containerization, Linux, Communication skills, Team-oriented mindset, Collaboration

Required

Ph.D. in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
5+ years of professional experience in data engineering, platform engineering, or backend software engineering roles
Strong proficiency in Python and experience building production-grade data pipelines and services
Extensive experience designing and operating data platforms on AWS, including services such as EC2, S3, DynamoDB, EKS/ECS, Lambda, Glue, Athena, and related services
Experience with Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or CDK
Expertise in designing, implementing, and maintaining relational and non-relational databases (e.g., MySQL, PostgreSQL, MongoDB)
Extensive experience with containerization and orchestration technologies
Strong proficiency with Linux and command-line–based workflows
Familiarity with modern data platform concepts, including data lakes, lakehouses, streaming, and batch processing architectures
Experience applying best practices in DevOps, DataOps, and/or MLOps, including CI/CD, monitoring, and automated testing
Strong communication skills and the ability to collaborate effectively with multidisciplinary scientific and engineering teams
Team-oriented mindset with a passion for building robust platforms that enable data-driven discovery and personalized medicine

Preferred

Familiarity with cancer biology concepts, including tumor genomics and molecular profiling workflows
Experience supporting data pipelines for molecular diagnostics, biomarker discovery, or translational research
Working knowledge of common molecular and clinical data types used in oncology research (e.g., NGS-derived data, variant annotations, expression matrices, clinical metadata)
Experience handling high-throughput sequencing–derived data and associated metadata at scale, including ingestion, normalization, and provenance tracking
Understanding of bioinformatics data standards and formats (e.g., FASTQ, BAM/CRAM, VCF, GTF, or similar structured scientific data representations)
Familiarity with public cancer and genomics datasets (e.g., TCGA, COSMIC, cBioPortal, GEO, or equivalent resources)
Experience collaborating closely with computational biologists, bioinformaticians, and cancer researchers to translate research requirements into scalable data platform solutions
Awareness of data quality, reproducibility, and traceability requirements in regulated or clinically adjacent oncology environments

Company

Caris Life Sciences

Caris Life Sciences develops molecular profiling and AI-driven technologies to support precision medicine in oncology.

Funding

Current Stage: Public Company
Total Funding: $1.86B
Key Investors: Braidwell, OrbiMed, Sixth Street
2025-06-18: IPO
2025-04-07: Private Equity · $168M
2023-01-19: Debt Financing · $400M

Leadership Team

Luke Power
Chief Financial Officer
Brian Stengle
SVP, Chief Marketing Officer
Company data provided by Crunchbase