Senior/Staff Scientist, Data Science jobs in United States

Glyphic Biotechnologies · 3 months ago

Senior/Staff Scientist, Data Science

Glyphic Biotechnologies is developing a revolutionary single-molecule proteome sequencing platform aimed at transforming life science discovery. They are seeking a highly motivated and experienced Senior/Staff Data Scientist to advance this technology by designing algorithms, developing machine learning models, and collaborating with a team of scientists and engineers.

Biotechnology · Health Care · Life Science

Responsibilities

Design and implement novel algorithms to analyze proteomics data that no one has ever seen before
Develop machine learning models that can extract meaningful insights from complex, noisy biological signals
Develop and optimize algorithms for analyzing high-dimensional chemistry and NGS data, including single cell, spatial data, and LCMS data outputs
Build models that reveal how parameters and molecular interfaces drive outcomes, including surface interactions and molecule-target binding
Design and execute biostatistical analyses using Python and/or R to uncover significant trends, model experimental outcomes, and inform data-driven decision-making
Apply machine learning to guide experiment design, identify key parameters, and optimize workflows for efficiency and reproducibility
Develop clear, insightful visualizations that make complex, high-dimensional results understandable and actionable for scientists and stakeholders
Help define metrics and visualizations that clarify high-dimensional relationships for scientists and stakeholders
Partner with wet lab, hardware, and software teams to translate experimental goals into computational strategies
Create ETL pipelines that clean, normalize, and integrate diverse datasets (sequencing reads, LCMS spectra, metadata) into analysis-ready formats
Combine off-the-shelf pipelines (basecalling, variant calling, deconvolution) with custom scripts to deliver end-to-end solutions
Continuously improve throughput and data quality by automating QC steps and integrating feedback from experiments
Establish best practices for code quality, testing, and deployment that will scale with our growing team

Qualifications

Python · R · Machine Learning · Bioinformatics · Data Visualization · Next Generation Sequencing · ETL Pipelines · Cloud Platforms · Chemistry Data Science · Soft Skills

Required

PhD in Computer Science, Bioinformatics, Computational Biology, Biostatistics or related field with 4+ (Senior) or 6+ (Staff) years of hands-on experience
Proven ability to model and interpret high-dimensional datasets with numerous interacting variables, uncovering statistically robust patterns and causal relationships
Competency in chemistry data science (e.g., interpreting LCMS data, utilizing deconvolution tools, understanding surface chemistry and molecule-target interactions)
Competency in next generation sequencing, including familiarity with multi-omics, error modeling, and basecalling
Expertise in Python and/or R for biostatistical analysis, including data wrangling, statistical modeling, and visualization of high-dimensional experimental results
Experience designing ML models for experimental data and deploying pipelines (Snakemake, Nextflow)
Familiarity with ML frameworks (PyTorch, TensorFlow) and data science libraries (pandas, numpy, scipy)
Experience building automated data pipelines and infrastructure for scalable analysis (cloud, Docker/Kubernetes)
Experience with cloud platforms (AWS, GCP, or Azure) and containerization tools (Docker, Kubernetes)
Proficiency with data visualization tools (matplotlib, seaborn, plotly) and Jupyter notebooks
Familiarity with version control (git) and pipeline workflow systems (Snakemake, Nextflow, etc.)

Preferred

Ability to work in performant languages (C++, Rust, Julia, or CUDA)
Ability to develop solutions that optimize the utilization of large-scale data storage, cloud processing infrastructure, and distributed computing
Direct proteomics experience (mass spectrometry, multiplex assays, etc.)
Deep learning experience with time-series data, signal processing, or sequence modeling
Ability to build and deploy scalable ML pipelines using PyTorch/TensorFlow for real-time protein sequence analysis
Experience with MLOps tools and practices for model deployment and monitoring
Experience building commercially successful life science tools that other scientists actually use and love
Previous startup or fast-paced industry (e.g., skunkworks) experience

Benefits

Employee Stock Option Plan
100% Health Plan Coverage for Employees & Dependents (Medical, Dental, & Vision)
Employer Retirement Contributions to 401(k)
Generous Paid Time Off
Paid Maternity and Paternity Leave
Health & Wellbeing Program
Office Snacks and Beverages
Regular Team Bonding Activities

Company

Glyphic Biotechnologies

Glyphic Biotechnologies is a biotechnology company that develops a protein sequencing platform.

Funding

Current Stage: Early Stage
Total Funding: $45.78M
Key Investors: FoundersX Ventures, LongeVC, National Institutes of Health
Funding Rounds:
2025-09-26 · Series A · $38M
2024-11-25 · Seed
2024-01-25 · Seed

Leadership Team

Joshua Yang, Co-Founder and CEO
Daniel Estandian, CTO and Co-Founder
Company data provided by Crunchbase