Qbtech · 1 month ago
Senior Data Engineer
Qbtech is seeking an experienced Senior Data Engineer to join its data team. The role involves designing, developing, and maintaining scalable data pipelines and data warehouses to support advanced analytics and business intelligence initiatives. It also involves collaborating with cross-functional teams to implement best practices in data architecture and optimize data processing workflows.
Health Care · Health Diagnostics · Medical Device · Technical Support
Responsibilities
Design, develop, and maintain robust ETL processes using tools such as Talend and Informatica, and custom scripting in Python, Bash, or other Unix shells
Build and manage scalable data pipelines utilizing Hadoop, Spark, Apache Hive, Azure Data Lake, and AWS services to handle large volumes of structured and unstructured data
Develop and optimize complex SQL queries for data extraction, transformation, and loading across platforms including Microsoft SQL Server, Oracle, and other relational databases
Architect and implement efficient data models for data warehouses supporting analytics and reporting needs
Collaborate with data scientists to facilitate model training by preparing clean datasets and integrating machine learning workflows
Design database schemas and ensure optimal database performance through indexing, partitioning, and normalization techniques
Integrate diverse data sources such as Linked Data and RESTful APIs into centralized repositories for comprehensive analysis
Support agile development practices by participating in sprint planning, code reviews, and continuous integration efforts
Monitor system performance, troubleshoot issues promptly, and implement improvements to ensure high availability of data services
Contribute to documentation of architecture designs, workflows, and best practices for team knowledge sharing
Qualifications
Required
Extensive experience with cloud platforms such as AWS (including S3), Azure Data Lake, and related cloud services
Strong programming skills in Java, Python, VBA, and Bash (Unix shell scripting) for automation tasks
Proficiency with big data technologies, including the Hadoop ecosystem (HDFS), Spark (PySpark), Apache Hive, and related tools
Expertise in ETL development using Talend, Informatica, or similar tools; strong SQL skills for complex query development across multiple databases such as SQL Server and Oracle
Knowledge of modern data architecture concepts including Data Warehouse design, Linked Data integration, and database modeling techniques
Familiarity with analytics tools such as Looker for visualization purposes
Experience with RESTful API integration for seamless data exchange between systems
Strong analysis skills with the ability to interpret complex datasets to inform decision-making
Experience working within Agile methodologies to deliver iterative improvements efficiently
Preferred
Understanding of model training processes within machine learning workflows
Company
Qbtech
Qbtech provides objective data for diagnosing or treating patients with ADHD.
Funding
Current Stage: Growth Stage
Total Funding: unknown
Key Investors: Verdane (2022-09-01, Private Equity)
Recent News
Business Wire
2025-06-02
2025-04-08
Company data provided by Crunchbase