Lead Data Science Engineer jobs in United States

Medidata Solutions · 2 weeks ago

Lead Data Science Engineer

Medidata Solutions is a leader in powering smarter treatments and healthier people through digital solutions for clinical trials. The Lead Data Science Engineer will apply advanced skills in data architecture and engineering, develop ETL pipelines, and collaborate with cross-functional teams to enhance data quality and support clinical innovations.

Cloud Data Services, Information Technology, Risk Management
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Apply advanced skills in data architecture, data science engineering, data modeling, and data quality using modern cloud-native technologies
Develop ETL pipelines and work with vector databases, automation, and CI/CD using tools such as Python, SQL, and Git
Develop LLM applications using Retrieval-Augmented Generation (RAG) and support fine-tuning for domain-specific tasks
Analyze and manipulate both structured and unstructured data sources, ensuring high data quality and readiness for downstream consumers
Document and communicate technical work clearly to stakeholders at all levels, both technical and non-technical
Collaborate effectively in Agile environments and cross-functional teams, building secure, scalable data pipelines into Snowflake from both on-premise and cloud-based sources

Qualifications

Data Science Engineering, Cloud-native Technologies, ETL Pipelines, Data Architecture, Python, SQL, Git/GitHub, Snowflake, Docker/Kubernetes, Agile Collaboration, Data Quality, AI/ML Integration, Technical Communication

Required

Bachelor's degree in a technical or scientific field, such as Statistics, Data Science, Computer Science, or similar
7+ years of experience in roles such as Data Scientist or Data Engineer with a strong foundation in Enterprise Data Architecture and Engineering
Hands-on experience with tools and concepts such as Airflow, CDC, batch processing, and job scheduling
Hands-on experience with data curation, cleansing, and annotation to support model fine-tuning and evaluation workflows
Experience building scalable, cloud-native data pipelines using tools and services such as Streamlit and Snowflake, and containerization platforms such as Docker/Kubernetes
Proficient in Git/GitHub, GitHub Actions for CI/CD, and managing infrastructure as code using Terraform
Hands-on experience building high-throughput data pipelines across cloud platforms and MCP server environments
Proficient in implementing RAG architectures, vector databases, and low-latency retrieval layers
Skilled at integrating AI/ML pipelines into production-grade data infrastructure

Preferred

Experience with clinical trial data is not required, but an interest in learning how these data improve medical research is paramount

Benefits

Medical, dental, life and disability insurance
401(k) matching
Flexible paid time off
10 paid holidays per year

Company

Medidata Solutions

Medidata is powering smarter treatments and healthier people through digital solutions to support clinical trials.

H1B Sponsorship

Medidata Solutions has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
Trends of Total Sponsorships
2025 (68)
2024 (68)
2023 (72)
2022 (95)
2021 (77)
2020 (64)

Funding

Current Stage: Public Company
Total Funding: $20M
Key Investors: TFS Trial Form Support
2023-11-06: Post-IPO Equity
2020-07-01: Post-IPO Equity
2019-06-12: Acquired

Leadership Team

Linda Magrath
Senior Vice President
Paul Chang
Senior Vice President, Design and Experience
Company data provided by Crunchbase