DeepRec.ai · 22 hours ago
Data Scientist
Responsibilities
Build ETL pipelines for large molecular, electrochemical and battery datasets
Deliver APIs and data services to ML and simulation teams
Apply QC, schemas and metadata to ensure ML-ready data
Run EDA to surface gaps and improve model performance
Work closely with scientists and ML researchers
Qualifications
Required
2+ years in data science, data engineering or scientific computing
Strong Python and SQL skills
Experience with chemical, pharma, energy or materials data
MS/PhD (or equivalent industry experience) in chemistry, materials science or chemical engineering
Preferred
RDKit, OpenBabel, molecular search, LLM data pipelines
Battery or electrochemistry data
Airflow, NoSQL, chemical APIs