ML Research Scientist I/II, Multimodal Data Extraction jobs in United States

Lila Sciences · 2 months ago

ML Research Scientist I/II, Multimodal Data Extraction

Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. As an ML Research Scientist - Multimodal Data Extraction, you will develop foundation models that autonomously read and structure scientific knowledge, contributing to advancements in materials science and chemistry.

Artificial Intelligence (AI), Foundational AI, Life Science, Software
H1B Sponsor Likely

Responsibilities

Research and develop AI systems that extract and structure knowledge from diverse scientific sources
Design and fine-tune large language, multimodal, and specialized models for factual, interpretable data extraction
Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals
Collaborate with domain experts to align extracted data with real-world discovery workflows
Publish research that advances the state of the art in multimodal understanding and AI-driven knowledge extraction

Qualifications

Machine Learning, NLP, Vision-Language Modeling, PyTorch, Hugging Face Transformers, Large Language Models, Multimodal Models, Data Structures, Passion for AI, Collaborative Mindset

Required

PhD (or equivalent research experience) in Computer Science, Chemistry, Materials Science, or related field
Expertise in machine learning, NLP, and vision–language modeling using PyTorch and Hugging Face Transformers
Proven ability to train, fine-tune, and evaluate LLMs and multimodal models for scientific data extraction
Strong understanding of data structures and representations used in the physical sciences
Demonstrated research impact through publications, preprints, or open-source work (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals)

Preferred

Experience with multimodal fusion architectures and document-level understanding
Knowledge of scientific document parsing (OCR, table extraction, figure-caption linking)
Familiarity with knowledge graph construction or reasoning systems for science
Experience with noisy or heterogeneous real-world scientific data
Collaborative mindset and passion for advancing AI in the physical sciences

Benefits

Bonus potential
Generous early equity

Company

Lila Sciences

Lila Sciences creates a scientific superintelligence platform and autonomous labs for life sciences, chemistry, and materials science. It is a sub-organization of Flagship Pioneering.

H1B Sponsorship

Lila Sciences has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship. Legend: represents job field similar to this job]
Trends of Total Sponsorships: 2025 (8)

Funding

Current Stage: Growth Stage
Total Funding: $550M
Key Investors: NVentures, Flagship Pioneering
Funding Rounds:
2025-10-14 · Series A · $115M
2025-09-14 · Series A · $235M
2025-03-10 · Seed · $200M
Company data provided by Crunchbase