Dassault Systèmes · 1 day ago
Lead Data Science Engineer
Medidata is powering smarter treatments and healthier people through digital solutions to support clinical trials. The Lead Data Science Engineer will apply advanced data architecture and engineering skills to develop ETL pipelines and LLM applications, ensuring high data quality and collaboration across teams.
Aerospace, Analytics, Apps, Artificial Intelligence (AI), Big Data, Information Technology, Internet, Product Design, Software, Virtual Reality
Responsibilities
Apply advanced skills in data architecture, data science engineering, data modeling, and data quality using modern cloud-native technologies
Develop ETL pipelines, working with vector databases, automation, and CI/CD using tools such as Python, SQL, and Git
Develop LLM applications using Retrieval-Augmented Generation (RAG) and support fine-tuning for domain-specific tasks
Analyze and manipulate both structured and unstructured data sources, ensuring high data quality and readiness for downstream consumers
Document and communicate technical work clearly to stakeholders at all levels, both technical and non-technical
Collaborate effectively in Agile environments and cross-functional teams, building secure, scalable data pipelines into Snowflake from both on-premise and cloud-based sources
Qualifications
Required
Bachelor's degree in a technical or scientific field, such as Statistics, Data Science, Computer Science, or similar
7+ years of experience in roles such as Data Scientist or Data Engineer with a strong foundation in Enterprise Data Architecture and Engineering
Hands-on experience with tools and concepts such as Airflow, CDC, batch processing, and job scheduling
Hands-on experience with data curation, cleansing, and annotation to support model fine-tuning and evaluation workflows
Experienced in building scalable, cloud-native data pipelines using tools and services like Streamlit, Snowflake and containerization platforms like Docker/Kubernetes
Proficient in Git/GitHub, GitHub Actions for CI/CD, and managing infrastructure as code using Terraform
Hands-on experience building high-throughput data pipelines across cloud platforms and MCP server environments. Proficient in implementing RAG architectures, vector databases, and low-latency retrieval layers. Skilled at integrating AI/ML pipelines into production-grade data infrastructure
Preferred
Experience with clinical trial data is not required, but an interest in learning and understanding how these data improve medical research is paramount
Benefits
Medical, dental, life and disability insurance
401(k) matching
Flexible paid time off
10 paid holidays per year
Company
Dassault Systèmes
Dassault Systèmes is a catalyst for human progress.
H1B Sponsorship
Dassault Systèmes has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2022 (1)
2020 (1)
Funding
Current Stage: Public Company
Total Funding: unknown
IPO: 1999-04-01
Company data provided by Crunchbase