Data Engineer, Knowledge Graphs jobs in United States

Mithrl · 1 month ago

Data Engineer, Knowledge Graphs

Mithrl is building the world’s first commercially available AI Co-Scientist, transforming messy biological data into insights. The Data Engineer, Knowledge Graphs will build infrastructure for the biological knowledge layer, focusing on ETL pipelines, schema design, and API creation to support the platform's needs.

Artificial Intelligence (AI), Data Center Automation, Life Science, Medical, Software

Responsibilities

Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
Design, implement, and evolve schemas and storage models for graph-structured biological data
Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
Partner closely with data scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
Build data models that support multi-tenant access, versioning, and reproducibility across releases
Implement scalable storage and indexing strategies for high-volume graph data
Maintain data quality, validate data integrity, and build monitoring around ingestion and usage
Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications
Support data warehousing, documentation, and API reliability
Ensure performance, reliability, and uptime for knowledge graph services

Qualifications

ETL pipelines, Database design, Graph data models, Python, API design, Cloud infrastructure, Data architecture, Communication, Team collaboration

Required

Strong experience as a data engineer or backend engineer working with data-intensive systems
Experience building ETL or ELT pipelines for large structured or semi-structured datasets
Strong understanding of database design, schema modeling, and data architecture
Experience with graph data models or willingness to learn graph storage concepts
Proficiency in Python or similar languages for data engineering
Experience designing and maintaining APIs for data access
Understanding of versioning, provenance, validation, and reproducibility in data systems
Experience with cloud infrastructure and modern data stack tools
Strong communication skills and ability to work closely with scientific and engineering teams

Preferred

Experience with graph databases or graph query languages
Experience with biological or chemical data sources
Familiarity with ontologies, controlled vocabularies, and metadata standards
Experience with data warehousing and analytical storage formats
Previous work in a tech bio company or scientific platform environment

Benefits

Comprehensive PPO health coverage through Anthem (medical, dental, and vision)
401(k) with top-tier plans

Company

Mithrl

Mithrl is a software development company that builds custom, on-demand workflows for NGS data.

Funding

Current Stage: Early Stage
Total Funding: $4M
Key Investors: Bonfire Ventures
2024-11-14 · Seed · $4M

Leadership Team

Shara Balakrishnan, Ph.D.
Chief Technology Officer
Company data provided by Crunchbase