Scientific Data Platform Architect — Antibody Discovery jobs in United States

Prellis Biologics · 1 month ago

Scientific Data Platform Architect — Antibody Discovery

Prellis Biologics is a pre-IPO biotech located in Berkeley, CA, focused on revolutionizing drug discovery through the integration of human biology and machine learning. The Scientific Data Platform Architect will design and build an end-to-end scientific data platform to enhance antibody discovery, ensuring data is structured, accessible, and ready for AI/ML applications.

3D Printing, Biopharma, Biotechnology, Therapeutics
H1B Sponsor Likely

Responsibilities

Own the canonical schemas (with selective JSONB), indexing/partitioning, materialized views, and stable entity IDs (samples, sequences, assays, runs)
Operate RDS/Aurora PostgreSQL, S3 for raw artifacts, and right-sized IAM/VPC access; set guardrails for backups, recovery, and monitoring (CloudWatch)
Make data Findable (catalog/registry tables, searchable metadata), Accessible (role-based access, documented APIs/exports), Interoperable (controlled vocabularies, standard formats such as CSV/Parquet, FASTA/VDJ, FCS/SPR), and Reusable (required metadata, units/QC flags, versioned tables)
Define and enforce data contracts, provenance, and lightweight review checkpoints
Build parsers/pipelines for instrument exports (CSV/TSV, FCS, ELISA/SPR/BLI), PipeBio repertoire/QC outputs, and Benchling entities via API/webhooks
Add validation, unit normalization, schema migrations, and automated checks
Create curated analytic views (assay roll-ups, QC dashboards, lineage), and implement interactive visuals (dose–response fits, sensograms, flow summaries, repertoire plots) with Plotly/Dash, Shiny, Spotfire, Streamlit, or similar
Deliver drill-downs, comparisons across runs/targets, and clean CSV/Excel exports
Build and maintain a small Shiny (R/Python) or Python app (FastAPI + Dash/Plotly/Streamlit) that is role-aware, searchable, and easy for scientists to use; deploy simply (EC2/ECS/Docker)
Publish feature-ready Parquet/Arrow datasets (sequence features, developability metrics, assay labels like KD/EC50, clonotypes) with dataset versioning, timestamps, and lineage
Provide reproducible extracts/snapshots for training, and ingest model predictions/scores back into Postgres and the UI
Set patterns and code standards, mentor contributors, review designs, and coordinate with Biology, Analytics, and QA/Compliance
Keep cost/performance sane; evolve the roadmap as assays and throughput grow
A clear Postgres schema with stable IDs, required metadata, and provenance supporting FAIR discovery
Automated ETL for Benchling + PipeBio + instruments, with validation and unit normalization
A usable app delivering interactive analytics & visualizations scientists rely on daily
ML-ready datasets with documented contracts; backups, monitoring, and a published data dictionary/metadata guide

Qualifications

PostgreSQL, Python, ETL, AWS, Data visualization, FAIR principles, Data modeling, Technical leadership, Collaboration, Mentoring

Required

Bachelor's degree in Computer Science or a similar field
7+ years building data platforms or complex data products; expert SQL/PostgreSQL (schema design, optimization, migrations)
Strong Python or R for data engineering and app development (Pandas/SQLAlchemy or Shiny/Plotly/Streamlit)
Proven ETL experience from files/APIs and pragmatic scheduling (cron/Airflow/Prefect—keep it simple)
Practical AWS with Postgres on RDS/Aurora, S3 for storage, basic IAM/VPC, and CloudWatch for monitoring
Hands-on analytics & visualization for scientific datasets
Working knowledge of FAIR principles and shaping AI/ML-ready datasets (features, labels, versioned exports)

Preferred

Benchling developer experience (entities, webhooks) and familiarity with PipeBio outputs
Exposure to lab data types (FCS, BLI/SPR, ELISA, NGS summaries, PDB) and data integrity concepts (ALCOA+, 21 CFR Part 11 basics)
Light containerization (Docker) and deploying a small app on EC2/ECS
Experience round-tripping model outputs to a database/UI; comfort with Jupyter/scikit-learn/PyTorch

Benefits

A competitive employee benefits package, including group medical, dental, and vision coverage, life and disability insurance, flexible spending accounts, and a 401(k) plan
Stock-based long-term incentives
Bonus plan
Holiday package including a 1+ week winter shutdown
Flexible work models, including remote and hybrid working arrangements, where possible

Company

Prellis Biologics

Prellis combines holographic tissue printing technology with fully human antibody discovery and in vitro human disease and ADME/Tox models.

H1B Sponsorship

Prellis Biologics has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data Powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2025 (1), 2023 (2)

Funding

Current Stage: Growth Stage
Total Funding: $79.37M
Key Investors: Celesta Capital, Khosla Ventures, True Ventures
2025-03-10: Series C
2023-11-17: Series C
2022-08-10: Series C, $35M

Leadership Team

Michael Nohaile, CEO
Company data provided by Crunchbase