Planned Systems International · 2 days ago
Data Engineer
Responsibilities
Collaborate with stakeholders, DevSecOps, and data scientists to execute on the analytics roadmap.
Understand the business domain and document requirements for data pipelines enabling descriptive, predictive, and prescriptive analytics.
Analyze and explore data from large data systems and sources, perform complex manipulations including federated joins, imputation, deduping, etc.
Use SQL, Python, and PySpark to analyze, clean, transform, and persist data from databases and various data formats.
Write and optimize SQL and SparkSQL queries against databases and Data Lakes.
Productionize data pipelines and add monitoring, support, and operational metrics.
Create visualizations and dashboards using Tableau.
Write unit, integration, and regression tests for data pipeline jobs.
Create entity-relationship diagrams (ERDs) and validate designs with prototype data models.
Collaborate with the data science team to develop optimized data models for machine learning.
Experiment with the latest technologies and present data-driven solutions to stakeholders, including executive management.
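The data-manipulation work listed above (deduping, imputation, persistence) might look like the following pandas sketch; the table, column names, and values are purely illustrative, not part of this role's actual systems:

```python
import pandas as pd

# Toy records with a duplicate row and a missing value,
# standing in for raw feed data (all names are hypothetical).
raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "region": ["east", "east", "west", "west"],
    "spend": [100.0, 100.0, None, 250.0],
})

# Dedupe on the business key, keeping the first occurrence.
clean = raw.drop_duplicates(subset=["customer_id"]).copy()

# Impute missing spend with the region-level mean.
clean["spend"] = clean["spend"].fillna(
    clean.groupby("region")["spend"].transform("mean")
)
```

The same dedupe/impute pattern carries over to PySpark via `dropDuplicates` and window functions when the data no longer fits on one machine.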
Qualifications
Required
U.S. Citizenship with the ability to obtain a U.S. Government Security Clearance
Intermediate to advanced level hands-on knowledge of SQL
3-5 years of experience with at least one major relational database: Oracle, PostgreSQL, MySQL
Hands-on experience with Python, including either the pandas DataFrame API for data manipulation or PySpark and Spark DataFrames
Experience with structured, unstructured, and semi-structured data in multiple file formats, including text, CSV, and JSON
Experience implementing various data engineering patterns
Experience with one of the major BI tools (Tableau, QlikView, Power BI, etc.)
Experience writing unit, integration, and regression tests
Understanding of the machine learning life cycle
Experience with one major notebook environment (Jupyter, Colab, Databricks, etc.) for Python
Effective communication, documentation and problem-solving skills
Ability to work in a fast-paced environment with a can-do attitude
Preferred
Experience with Databricks Unified Analytics Platform for data engineering is strongly preferred
Experience with dbt (data build tool)
Experience with AWS DMS (Database Migration Service)
Experience with Spark and Spark SQL, including the structured DataFrame API, is strongly preferred
Experience with AWS cloud and AWS big data services such as SageMaker, Athena, EMR, Glue, and S3
Experience with machine learning using Python and/or Spark libraries
Experience with big data file formats such as Parquet and Delta
Experience creating a Delta Lake or data lake for large, complex datasets
Benefits
Paid leave
Employer-sponsored group medical, dental, and vision
Short-term and long-term disability
Life insurance
AD&D coverage
Legal services
Identity theft insurance
Accident insurance
401(k) with employer contribution match
Flexible spending account
Health savings account
Professional growth opportunities through courses, certifications, and tuition reimbursement