This job has closed.

GSK · 4 weeks ago

Data Architect II

GSK is a global biopharma company focused on uniting science, technology, and talent to advance health. As a Data Architect II, you will design data architectures and collaborate with teams to enhance data accessibility and support AI workflows, contributing to GSK's mission of accelerating medical discovery.

Biotechnology · Health Care · Pharmaceutical
H1B Sponsor Likely

Responsibilities

Partner with the Scientific Knowledge Engineering team to develop physical data models to build fit-for-purpose data products
Design data architecture aligned with enterprise-wide standards to promote interoperability
Collaborate with the platform teams and data engineers to maintain architecture principles, standards, and guidelines
Design data foundations that support GenAI workflows including RAG (Retrieval-Augmented Generation), vector databases, and embedding pipelines
Work across business areas and stakeholders to ensure consistent implementation of architecture standards
Lead reviews and maintain architecture documentation and best practices for Onyx and our stakeholders
Adopt security-first design with robust authentication and resilient connectivity
Provide best practices, leadership, subject-matter expertise, and GSK knowledge to architecture and engineering teams composed of GSK FTEs, strategic partners, and software vendors

Qualifications

Data architecture, Big Data platforms, Cloud data architecture, Data engineering, AI/ML workflows, Data warehouse, Data lake, Relational databases, Dimensional databases, NoSQL technologies, Python, Scala, Java, Agile frameworks, Pharmaceutical background, Communication

Required

Bachelor's degree in computer science, engineering, data science, or a similar discipline
5+ years of experience in data architecture, data engineering, or related fields in pharma, healthcare, or life sciences R&D
3+ years' experience defining architecture standards and patterns on Big Data platforms
3+ years' experience with data warehouse, data lake, and enterprise big data platforms
3+ years' experience with enterprise cloud data architecture (preferably Azure or GCP) and delivering solutions at scale
3+ years of hands-on relational, dimensional, and/or analytic experience (using RDBMS, dimensional, and NoSQL data platform technologies, plus ETL and data ingestion protocols)

Preferred

Master's or PhD in computer science, engineering, data science, or a similar discipline
Deep knowledge of, and experience with, at least one common programming language (e.g., Python, Scala, Java)
Experience with AI/ML data workflows: feature stores, vector databases, embedding pipelines, model serving architectures
Familiarity with GenAI/LLM data patterns: RAG architectures, prompt engineering data requirements, fine-tuning data preparation
Experience with GCP data/analytics stack: Spark, Dataflow, Dataproc, GCS, BigQuery
Experience with enterprise data tools: Ataccama, Collibra, Acryl
Experience with Agile frameworks: SAFe, Jira, Confluence, Azure DevOps
Experience applying CI/CD principles to data solutions
Experience with Spark and RAG-based architectures for data science and ML use cases
Strong communication skills: ability to explain technical concepts to non-technical stakeholders
Pharmaceutical, healthcare, or life sciences background

Benefits

Health care and other insurance benefits (for employee and family)
Retirement benefits
Paid holidays
Vacation
Paid caregiver/parental and medical leave

Company

We are uniting science, technology and talent to get ahead of disease together. Our community guidelines: https://gsk.to/socialmedia

H1B Sponsorship

GSK has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (45)
2024 (56)
2023 (54)
2022 (53)
2021 (54)
2020 (72)

Funding

Current Stage
Public Company
Total Funding
$25.51M
Key Investors
CARB-X
2021-03-02 · Grant · $18M
2020-09-23 · Grant · $7.51M
1978-01-13 · IPO

Leadership Team

Julie Brown · CFO
Mike Elmore · SVP & Chief Information Security Officer, GSK
Company data provided by Crunchbase