AI Data Engineer jobs in United States

C the Signs · 1 month ago

AI Data Engineer

C the Signs is focused on shaping the future of healthcare through AI technology. The Data Engineer will be responsible for developing and optimizing data pipelines for healthcare datasets, ensuring data quality and compliance while collaborating with data scientists and machine learning engineers.

Health Care · Hospitality
No H1B · U.S. Citizen Only

Responsibilities

Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning
Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets
Implement robust data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets
Implement robust data cleaning, validation, and transformation processes to ensure data quality and integrity
Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models
Work with the team to identify and acquire new data sources, ensuring compliance with relevant healthcare regulations (e.g., HIPAA)
Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability
Document data engineering processes, data models, and data dictionaries
Stay up-to-date with the latest advancements in data engineering, big data technologies, and machine learning

Qualifications

Data engineering principles · Big data technologies · Data warehousing · Python · Cloud services · Apache Spark · Data modeling · Healthcare data experience · Machine learning concepts · Data orchestration tools · Problem-solving · Communication skills · Interpersonal skills

Required

Bachelor's degree in Computer Science, Engineering, or a related field
Proven experience as a Data Engineer, with a focus on big data technologies
Strong proficiency in programming languages such as Python, Scala, or Java
Extensive experience with data warehousing, ETL processes, and data modeling
Experience with major cloud providers (e.g., AWS, GCP, Azure) and their data storage and processing services
Hands-on experience with big data frameworks like Apache Spark for distributed processing
Excellent problem-solving skills and the ability to work independently and as part of a team
Strong communication and interpersonal skills

Preferred

Master's degree in a related field
Experience with healthcare data and a good understanding of healthcare data standards (e.g., FHIR, HL7)
Familiarity with machine learning concepts and LLM fine-tuning processes
Experience with data orchestration tools (e.g., Apache Airflow)

Benefits

Competitive salary and benefits package
Flexible working arrangements (remote or hybrid options available)
The opportunity to work on life-changing AI technology that directly impacts patient outcomes
Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity
Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare

Company

C the Signs

C the Signs gives GPs the ability to check combinations of signs, symptoms, and risk factors simultaneously, in an easy-to-use format.

Funding

Current Stage
Growth Stage
Total Funding
$16.08M
Key Investors
Khosla Ventures · CancerX Startup Accelerator · MMC Ventures
2025-01-30 · Series Unknown · $8M
2024-02-26 · Non Equity Assistance
2022-02-04 · Series A · $6.77M

Leadership Team

Greg Joondeph-Breidbart
Chief Technology Officer
Company data provided by Crunchbase