Lila Sciences · 2 months ago

Machine Learning Scientist I/II, Medicinal Chemistry & Lead Optimization

Cambridge, MA USA

Full-time

Onsite

Entry, Mid Level

$176K/yr - $304K/yr

Lila Sciences is pioneering a new age of boundless discovery by building capabilities to apply AI to every aspect of the scientific method. The Machine Learning Scientist I/II will join the Drug Discovery group to develop AI tools that optimize medicinal chemistry processes and improve candidate quality through ligand-based modeling and data-driven design strategies.

Artificial Intelligence (AI) · Life Science · Software

H1B Sponsor Likely

Responsibilities

Develop multi-task and transfer-learned models for potency, selectivity, and developability using graph/message-passing and conformer-aware features

Build models that learn from HTS, DEL, and follow-up assays, with robust curve-fitting, plate/batch effect correction, dose–response QC, and time-split/scaffold-split evaluations to ensure prospective reliability

Create active learning and Bayesian optimization strategies to propose the next best analogs under multi-parameter objectives

Integrate template-based and template-free retrosynthesis with reaction prediction, condition and yield modeling, building-block availability, and cost/time/risk scoring

Build BRICS/RECAP/fragment-linking enumerations and property-conditioned generative models that respect synthetic constraints and matched molecular pair rules for local SAR exploration and scaffold hopping

Automate MMP analysis, local SAR maps, and substructure attributions to surface chemist-actionable insights; link assay deltas to specific modifications and highlight potential bioisosteres and de-risking moves

Establish cheminformatics pipelines for standardization, deduplication, structure normalization, and assay/ELN/LIMS ingestion; define ontologies and metadata for traceability and reproducibility

Design leakage-safe splits, conformal prediction for calibrated decisions, and prospective tests. Ship APIs and tools that integrate with medchem workflows, procurement, and automated synthesis

Work closely with medicinal chemists, DMPK, biology, and automation to translate TPPs into modeling objectives and to operationalize model recommendations in real make–test cycles

Qualifications

Python · Ligand-based modeling · Medicinal chemistry principles · Cheminformatics tools · Retrosynthesis planning · Active learning · Data foundations · Demonstrated industry experience · Self-starter · Attention to detail · Clear communication · Collaboration skills

Required

Strong proficiency in Python and modern ML (PyTorch/JAX/TF, scikit-learn, XGBoost/CatBoost), with experience training at scale and deploying end-to-end pipelines

Deep experience in ligand-based modeling (QSAR/QSPR, multi-task learning, uncertainty and applicability domain, calibration) and ADMET prediction for medicinal chemistry

Solid grasp of medicinal chemistry principles: SAR development, bioisosteres, property tuning (pKa/logD/PSA), selectivity design, and liability mitigation (CYP, hERG, reactivity, permeability, solubility)

Cheminformatics and data tooling: RDKit, Chemprop/DeepChem, conformer generation, fingerprints/descriptors, ELN/LIMS integration, and assay data processing/curve-fitting

Retrosynthesis and synthesis planning: familiarity with template-based/template-free methods, route scoring, reaction/yield/condition prediction, building block catalogs, and makeability constraints

Active learning and design-of-experiments: Bayesian optimization, diversity sampling, and portfolio-aware selection under experimental and synthesis budgets

Ability to design rigorous, leakage-controlled benchmarks and prospective validations; experience with scaffold/time splits and activity-cliff-aware evaluation

Strong self-starter with excellent attention to detail and clear communication; able to collaborate tightly with chemists and biologists

Demonstrated industry experience or academic achievement

Preferred

PhD in Chemoinformatics, Medicinal Chemistry, Computational Chemistry, Computer Science, or related field with a strong publication record in ML/drug discovery venues

Experience building synthesis-aware generative models and integrating retrosynthesis into design loops; familiarity with tools like ASKCOS/AiZynth-style planners or equivalent

Track record improving DMTA cycle time and MPO outcomes in live programs; integration with procurement and automated synthesis platforms

Expertise with MMPA, activity-cliff handling, conformal prediction, and applicability-domain diagnostics in production

Experience triaging HTS/DEL data, PAINS/aggregator/covalent liability filters, and off-target/polypharmacology prediction

MLOps for cheminformatics: data versioning, experiment tracking, model serving/monitoring, and cloud/HPC scaling

Benefits

Bonus potential

Generous early equity

Company

Lila Sciences

Lila Sciences creates a scientific superintelligence platform and autonomous labs for life sciences, chemistry, and materials science. It is a sub-organization of Flagship Pioneering.

Founded in 2023

Cambridge, Massachusetts, USA

201-500 employees

https://www.lila.ai

H1B Sponsorship

Lila Sciences has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided for reference (data from the US Department of Labor).

Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced)

Trends of Total Sponsorships

2025: 8

Funding

Current Stage

Growth Stage

Total Funding

$550M

Key Investors

NVentures · Flagship Pioneering

2025-10-14 · Series A · $115M

2025-09-14 · Series A · $235M

2025-03-10 · Seed · $200M

Recent News

Axios

2025's AI-fueled scientific breakthroughs

2026-01-03

masslive

Here’s what happened to Mass. startup investing in 2025

2025-12-17

MIT Technology Review

AI materials discovery now needs to move into the real world

2025-12-15

Company data provided by Crunchbase