ExecutivePlacements.com · 3 hours ago
Machine Learning Data Engineer
ExecutivePlacements.com is seeking a Machine Learning Data Engineer for their client, a Consumer Products and Software Services company. The role involves designing and building data pipelines, ingesting and preprocessing data, collaborating with ML teams, and maintaining data quality and performance.
Human Resources · Online Portals · Recruiting
Responsibilities
Design and Build Data Pipelines: Create efficient, reliable, streamable, and scalable data pipelines using industry-standard tools and techniques, such as TorchData, WebDataset, Apache Parquet, Python, and SQL
Data Ingestion: Develop strategies for ingesting data from data providers, ensuring data quality and consistency
Data Pre-processing: Implement parallel pre-processing to clean, transform, de-duplicate, combine, and normalize data
Data Curation and Enrichment: Curate, augment, and enrich existing datasets to improve data quality and provide valuable insights to stakeholders
Synthetic Data Generation: Collaborate with synthetic data teams to generate data and incorporate it into existing pipelines
Collaboration with ML Teams: Work closely with ML scientists, engineers, and product teams to understand data requirements, and collaborate on data delivery
Monitoring, Maintenance & Updating: Monitor data pipelines for performance, errors, and bottlenecks, and implement regular maintenance and updates. Stay updated with the latest trends and incorporate best practices into data pipelines
Technical Documentation: Document data pipelines, settings, and procedures for easy maintenance and knowledge sharing
Qualifications
Required
Bachelor's degree in Computer Science, Information Technology, or a related field
At least years of experience as a Software Engineer or Data Engineer
Strong software engineering skills and proficiency in Python
Experience with data processing tools and formats such as Apache Parquet, WebDataset, TorchData, Pandas, Shell Scripting, Protobuf, TFRecord
Knowledge of data warehouse architectures and cloud-based systems (e.g., AWS S3)
Strong problem-solving and analytical skills
Excellent communication and collaboration skills
Preferred
Master's degree in Data Science or a related field
Experience with data curation and enrichment techniques, particularly for large-scale text, image, and video data
Familiarity with natural language processing (NLP) and machine learning (ML) concepts and frameworks (e.g., PyTorch)
Benefits
Excellent growth and advancement opportunities
Company
ExecutivePlacements.com
Online recruitment
Funding
Current Stage
Early Stage
Company data provided by Crunchbase