Baylor Genetics · 8 hours ago
Principal Bioinformatics Data Scientist
Baylor Genetics is seeking an experienced and visionary Principal Bioinformatics Data Scientist to join our Bioinformatics R&D and Data Science team. This individual will play a pivotal role in advancing genomic analysis capabilities through innovative computational methods and ML/AI-driven models to deliver clinically actionable genomic insights.
Biotechnology, Health Care
Responsibilities
Design, develop, and optimize computational algorithms, statistical models, and machine learning/AI approaches for genomic data analysis
Develop algorithms for variant prioritization, pathogenicity prediction, phenotype-genotype association, and diagnostic decision support
Lead efforts to apply deep learning and predictive modeling to variant interpretation, phenotype correlation, and diagnostic decision support
Develop and optimize algorithms for secondary and tertiary genomic analysis, variant calling, variant annotation, ACMG classification, and data-driven interpretation of genomic findings
Apply deep learning architectures (CNNs, RNNs, GNNs, transformers) and probabilistic modeling approaches to improve variant calling, variant interpretation, and disease prediction
Design and optimize Bayesian, regression, and ensemble models to quantify uncertainty, improve confidence scoring, and support clinical decision-making
Develop feature engineering and dimensionality reduction strategies for multi-modal data integration (genomic, transcriptomic, phenotypic, and clinical)
Serve as a subject matter expert (SME) in data science applications for bioinformatics pipeline development, including secondary and tertiary analysis, variant interpretation, and clinical reporting automation
Drive enhancements to existing bioinformatics pipelines for improved accuracy, performance, and interpretability
Integrate and analyze multi-omics datasets (genomic, transcriptomic, phenotypic, and clinical) to extract meaningful biological and clinical insights
Evaluate and enhance existing bioinformatics and data science pipelines for improved accuracy, speed, and scalability
Collaborate with engineering teams to integrate new algorithms and frameworks into production-grade analysis pipelines
Drive novel genomic data platform development to support a cohesive data ecosystem
Drive innovation in genomics data science through research and development of novel analytical methodologies
Stay current with emerging tools, frameworks, and technologies in ML, AI, bioinformatics, and genomics, and guide the team in adopting best practices
Partner closely with laboratory scientists, clinical geneticists, software engineers, and data engineers to translate scientific insights into clinical applications
Serve as a scientific and technical thought leader across cross-disciplinary projects
Evaluate emerging technologies, frameworks, and methodologies to ensure Baylor Genetics remains at the forefront of computational genomics innovation
Present findings and strategic recommendations to executive and scientific leadership
Provide technical mentorship and guidance to bioinformatics scientists, data scientists, and software engineers
Contribute to strategic planning and technical direction for the Bioinformatics R&D and Data Science group
Qualifications
Required
Master's degree or higher (PhD preferred) in Bioinformatics, Computer Science, Data Science, Computational Biology, or a related field
6+ years of professional experience in genomic data science, bioinformatics, computational genomics, or a similar field, including at least 3 years in a senior or lead role
Proven track record of statistical, machine learning, and AI model development using genomic and clinical data
Strong experience with secondary and tertiary genomic analysis (alignment, variant calling, annotation, and interpretation)
Experience with data lakehouse platforms (Databricks, Snowflake) and precision health platforms (DNAnexus, Velsera)
Experience with big data, ETL, data visualization, workflow orchestration and logging, and databases (SQL, NoSQL, and graph-based)
Demonstrated experience working in a clinical or diagnostic genetics environment is highly desirable
Proficient in Python, R, C/C++, Java, or similar programming languages
Expertise in machine learning frameworks (e.g., TensorFlow, PyTorch, Scikit-learn, XGBoost)
Advanced understanding of statistical modeling, including Bayesian inference, GLMs, mixed models, and resampling methods
Experience applying deep learning architectures (transformers, CNNs, GNNs) to genomic and biomedical data
Deep understanding of statistical analysis, data modeling, and computational methods used in genomics
Experience with NGS data formats and genome databases
Familiarity with cloud computing environments (Azure, AWS, GCP) and distributed computing frameworks (e.g., Spark, Dask)
Deep knowledge of statistical modeling, dimensionality reduction, and data visualization
Familiarity with CI/CD, containerization (Docker/Kubernetes), and version control (Git)
Exceptional analytical, problem-solving, and critical thinking skills
Ability to translate complex data-driven analyses into actionable biological and clinical insights
Excellent written and verbal communication skills, with the ability to communicate effectively across disciplines
Deep understanding of both computational methods and biological context
Demonstrated leadership in cross-functional team environments
Passion for innovation in precision medicine and clinical genomics
Company
Baylor Genetics
Baylor Genetics offers a full spectrum of cost-effective genetic testing and provides clinically relevant solutions.
H1B Sponsorship
Baylor Genetics has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (3)
2024 (1)
2023 (3)
2022 (1)
2021 (1)
2020 (3)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase