ML Research Scientist I/II, Multimodal Data Extraction jobs in United States

Lila Sciences · 1 day ago

ML Research Scientist I/II, Multimodal Data Extraction

Lila Sciences is pioneering a scientific superintelligence platform aimed at enhancing discovery in life sciences, chemistry, and materials science. As an ML Research Scientist, you will develop AI systems to extract and structure scientific knowledge from diverse sources, contributing to advancements in multimodal understanding and autonomous discovery.

Artificial Intelligence (AI), Life Science, Software
H1B Sponsor Likely

Responsibilities

Research and develop AI systems that extract and structure knowledge from diverse scientific sources
Design and fine-tune large language, multimodal, and specialized models for factual, interpretable data extraction
Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals
Collaborate with domain experts to align extracted data with real-world discovery workflows
Publish research that advances the state of the art in multimodal understanding and AI-driven knowledge extraction

Qualifications

Machine Learning, NLP, Vision-Language Modeling, PyTorch, Hugging Face Transformers, Large Language Models, Data Structures, Scientific Document Parsing, Passion for AI, Collaborative Mindset

Required

PhD (or equivalent research experience) in Computer Science, Chemistry, Materials Science, or related field
Expertise in machine learning, NLP, and vision–language modeling using PyTorch and Hugging Face Transformers
Proven ability to train, fine-tune, and evaluate LLMs and multimodal models for scientific data extraction
Strong understanding of data structures and representations used in the physical sciences
Demonstrated research impact through publications, preprints, or open-source work (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals)

Preferred

Experience with multimodal fusion architectures and document-level understanding
Knowledge of scientific document parsing (OCR, table extraction, figure-caption linking)
Familiarity with knowledge graph construction or reasoning systems for science
Experience with noisy or heterogeneous real-world scientific data
Collaborative mindset and passion for advancing AI in the physical sciences

Benefits

Bonus potential
Generous early equity

Company

Lila Sciences

Lila Sciences creates a scientific superintelligence platform and autonomous labs for life sciences, chemistry, and materials science. It is a sub-organization of Flagship Pioneering.

H1B Sponsorship

Lila Sciences has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Distribution of job fields receiving sponsorship (chart not reproduced)
Total sponsorships by year: 2025 (8)

Funding

Current Stage: Growth Stage
Total Funding: $550M
Key Investors: NVentures, Flagship Pioneering
2025-10-14: Series A, $115M
2025-09-14: Series A, $235M
2025-03-10: Seed, $200M
Company data provided by Crunchbase