Ampstek · 14 hours ago
Lead Data Engineer (USC/GC only)
Ampstek is seeking a Lead Data Engineer to design, develop, and maintain scalable ETL pipelines. The role involves implementing monitoring solutions, managing deployment pipelines, and executing the complete analytics lifecycle to support analytics and reporting needs.
Responsibilities
Design, develop, and maintain scalable ETL pipelines to ensure data quality and availability
Implement monitoring and alerting solutions to ensure data pipeline reliability and performance
Develop and manage deployment pipelines to facilitate continuous integration and delivery of data engineering solutions
Implement data integration solutions to support analytics and reporting needs
Execute the complete analytics lifecycle for problem solving, including:
Algorithm operationalization
Model validation
Model prototyping
Data exploration
Data grooming
Survey varied data sources for analytic relevance, including:
External sources accessed via API
Flat files
Relational databases
Distributed file systems
Expertise in data engineering languages such as Scala (preferred) or Java, with proficiency in Python
Experience with big data tools, particularly Spark
Proficiency in building and managing ETL pipelines
Expert-level quantitative analysis skills, including interpretation of model results, consideration of causality, and treatment of multicollinearity
The ability to work in compiled, high-performance languages (e.g., Scala, Java, C++)
Experience with relational databases
Strong understanding of relational databases and SQL, and familiarity with NoSQL databases
Broad experience and solid theoretical foundation on the modeling process using a variety of algorithmic techniques, including Machine Learning, and Graph/Network Analytics
Data pre-processing, exploratory data analysis using a variety of techniques
Basic understanding of data architecture, data warehouse, and data marts
Demonstrated ability and desire to continually expand skill set, and learn from and teach others
Qualifications
Required
ETL
MLOps
AI/ML
Data warehousing
Python
AWS
Preferred
Expertise in data engineering languages such as Scala
Experience with BigData tools, particularly Spark