Infinitive
Data Engineer (Python/PySpark/AWS)
Infinitive is a data and AI consultancy that helps clients modernize and operationalize their data. They are seeking a highly skilled Data Engineer to design, develop, and maintain data infrastructure, with a focus on ETL processes, data integration, and CI/CD implementation.
Advertising · Information Technology · Internet · Marketing
Responsibilities
Collaborate with cross-functional teams to understand data requirements and design robust data architecture solutions
Develop data models and schema designs to optimize data storage and retrieval
Implement ETL processes to extract, transform, and load data from various sources
Ensure data quality, integrity, and consistency throughout the ETL pipeline
Utilize your expertise in Python and PySpark to develop efficient data processing and analysis scripts
Optimize code for performance and scalability, staying current with industry best practices
Integrate data from different systems and sources to provide a unified view for analytical purposes
Collaborate with data scientists and analysts to implement solutions that meet their data integration needs
Design and implement streaming workflows using Spark Structured Streaming (via PySpark) or other relevant technologies
Develop batch processing workflows for large-scale data processing and analysis
Implement and maintain continuous integration and continuous deployment (CI/CD) pipelines using Jenkins or GitHub Actions
Automate testing, code deployment, and monitoring processes to ensure the reliability of data pipelines
Qualifications
Required
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
Proven experience as a Data Engineer or similar role
Strong programming skills in Python and expertise in PySpark for both batch and streaming data processing
Hands-on experience with ETL tools and processes
Familiarity with CI/CD tools such as Jenkins or GitHub Actions
Solid understanding of data modeling, database design, and data warehousing concepts
Excellent problem-solving and analytical skills
Strong communication and collaboration skills
Candidates must be local to the Washington, D.C. metro area
Preferred
Knowledge of cloud platforms such as AWS, Azure, or Google Cloud
Experience with version control systems (e.g., Git)
Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes)
Understanding of data security and privacy best practices