Allen Institute · 3 days ago
Director of Machine Learning
The Allen Institute is dedicated to advancing bioscience and improving human health through open science initiatives. They are seeking a Director of Machine Learning to lead ML strategy and execution for projects focused on biological design, overseeing a central ML team and collaborating across various project teams.
Neuroscience
Responsibilities
In partnership with the Executive Director and in collaboration with the Allen Technology Office, define and own the ML strategy for the enhancer flywheel and additional synthetic biology flywheels, including success metrics and roadmaps
Build and manage a central ML team, plus ML/data scientists embedded in project teams
Architect and implement sequence-to-function and generative models for regulatory element and other DNA, RNA, and protein design, leveraging state-of-the-art architectures (CNNs, transformers, diffusion, etc.)
Design and optimize DBTL loops via collaboration with project teams, e.g., supporting assay design, active learning tactics, assay configuration, and benchmarking
Supervise quantitative analysis and QC of high-throughput assays (e.g., MPRA, single-cell data), integrating external datasets such as scATAC-seq and RNA-seq for transfer learning
Prioritize projects based on organizational goals, collaborating cross-functionally to ensure timely, high-quality delivery
Establish ML best practices across projects (code quality, experiment tracking, model and data versioning, documentation, reproducibility)
Partner with data/engineering teams in the Office of the CTO to define and maintain the computational infrastructure required for large-scale sequence modeling and genomics data integration
Serve as the primary program ML representative, clearly communicating strategy, trade-offs, and results to project leads, leadership, and external collaborators, and contributing to publications and presentations
Propose and develop ML partnerships across academia, biotech, non-profits, and industry in support of our mission
Qualifications
Required
Ph.D. in Computer Science, Computational Biology, Statistics, Physics, or related field; or equivalent combination of degree and experience
5+ years of post-Ph.D. (or equivalent) experience building, training, and deploying ML models in a research or product environment
Deep expertise in ML applied to biological sequences or structured biological data (e.g., regulatory genomics, transcriptional modeling, protein/DNA design)
Strong proficiency in Python and at least one modern ML framework (e.g., PyTorch, JAX, or TensorFlow)
Proven track record of technical leadership: mentoring scientists/engineers, setting standards, and delivering complex ML systems
Excellent communication skills and ability to collaborate effectively with both computational and experimental scientists
Preferred
Demonstrated experience integrating diverse datasets (e.g., ATAC-seq, RNA-seq, single-cell data) into predictive or generative models
Research experience in regulatory genomics, enhancers/promoters, transcription factor binding, or MPRA-based model training
Experience with AI-driven protein design tools such as RFdiffusion, ProteinMPNN, or comparable workflows
Hands-on work with DBTL loops in synthetic biology, including active learning, experiment selection, or closed-loop optimization
Experience with generative models for biological sequences (e.g., autoregressive, VAE, diffusion, RL-based sequence design)
Prior experience leading ML efforts in small, fast-moving, or start-up-style research environments
Strong publication or open-source record in ML for biology, sequence modeling, or synthetic biology
Benefits
Medical
Dental
Vision
Basic life insurance
401k plan
Paid time off
Company
Allen Institute
The Allen Institute is dedicated to answering some of the biggest questions in bioscience and accelerating research worldwide.
H1B Sponsorship
Allen Institute has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (41)
2024 (33)
2023 (25)
2022 (26)
2021 (17)
2020 (7)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase