Data Engineer @ myGwork - LGBTQ+ Business Community | Jobright.ai
Data Engineer jobs in Boston, MA
Fewer than 25 applicants
This job has closed.

myGwork - LGBTQ+ Business Community · 3 days ago

Data Engineer

Industry: Internet


Responsibilities

Design, develop, and maintain scalable data pipelines using PySpark on Databricks, adhering to software engineering best practices (a minimal pipeline sketch follows this list).
Implement and optimize stream processing workflows using Kafka for real-time data ingestion and processing.
Utilize Parquet and Avro-formatted data files for efficient storage and retrieval, ensuring data schema compatibility and evolution.
Leverage the Databricks platform on AWS to build and manage data processing workflows and analytics, while adhering to development lifecycle standards.
Use Databricks Delta Lake and Parquet files for data warehousing, query optimization, and data versioning.
Collaborate closely with data analysts and scientists to understand their requirements and provide reliable and timely data solutions.
Implement robust testing methodologies, including unit, integration, and end-to-end testing, using Python packages such as pytest (a minimal test sketch follows this list).
Contribute to the PySpark/Python ecosystem by creating reusable components, maintaining internal PyPI packages, and exploring other common Python packages.
Monitor data pipelines, identify and resolve issues, and ensure data integrity and quality.
Stay up-to-date with the latest trends and technologies in data engineering, software development, and testing practices, and actively share knowledge with the team.
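
To illustrate the pipeline work described above, here is a minimal PySpark Structured Streaming sketch that reads JSON events from a Kafka topic and lands them in a Delta table. The broker address, topic name, schema fields, and storage paths are hypothetical placeholders, not details from this posting.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructField, StructType

    spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

    # Hypothetical event schema; in practice this often comes from a schema registry.
    schema = StructType([
        StructField("order_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    # Read raw events from a Kafka topic (placeholder broker and topic).
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "orders")
        .load()
    )

    # Kafka values arrive as bytes; decode and parse the JSON payload into columns.
    parsed = (
        raw.select(from_json(col("value").cast("string"), schema).alias("event"))
        .select("event.*")
    )

    # Append to a Delta table, with a checkpoint so the stream can recover its progress.
    query = (
        parsed.writeStream.format("delta")
        .outputMode("append")
        .option("checkpointLocation", "/mnt/checkpoints/orders")
        .start("/mnt/delta/orders")
    )

Delta's transaction log is also what enables the data versioning mentioned above: spark.read.format("delta").option("versionAsOf", 0).load("/mnt/delta/orders") reads an earlier snapshot of the same table.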
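
In the same spirit, a minimal pytest sketch of the unit-testing practice called out above. The transformation filter_large_orders is a hypothetical function invented for illustration; the fixture spins up a small local Spark session for tests.

    import pytest
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    def filter_large_orders(df, threshold):
        """Hypothetical transformation under test: keep orders at or above a threshold."""
        return df.filter(col("amount") >= threshold)

    @pytest.fixture(scope="session")
    def spark():
        # A single-threaded local session is enough for transformation-level unit tests.
        return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()

    def test_filter_large_orders(spark):
        df = spark.createDataFrame([("a", 5.0), ("b", 50.0)], ["order_id", "amount"])
        result = filter_large_orders(df, threshold=10.0)
        assert [row.order_id for row in result.collect()] == ["b"]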

Qualifications


Skills: Data Engineering, PySpark, Python, Shell Scripting, Software Engineering, Development Lifecycle, Workflow Management, Airflow, Stream Processing, Kafka, Avro, Data Serialization, Databricks, AWS, Data Warehousing, Delta Lake, Parquet Files, SQL, Relational Databases, Testing, Unit Tests, Integration Tests, End-to-End Tests, PyPI Packages, Python Ecosystem, Technical Concepts, Cloud Platforms, Azure, Large Datasets, Problem-Solving

Required

Bachelor's or master's degree in computer science or a related field.
Minimum 5 years of real-world Data Engineering experience working on large-scale data projects.
Strong proficiency in PySpark, Python, and shell scripting, with a focus on software engineering best practices and a deep understanding of the development lifecycle.
Experience with workflow management tools such as Airflow (a minimal DAG sketch follows this list).
Experience with stream processing technologies, preferably Kafka.
Familiarity with Avro data serialization format and its usage in data engineering workflows.
Expertise in using the Databricks platform for data processing and analytics, with working experience on cloud platforms such as AWS or Azure (AWS preferred).
Solid understanding of data warehousing concepts and experience with Delta Lake and Parquet files.
Proficiency in SQL, including writing complex queries, and experience with relational databases.
Strong testing skills, with experience in implementing and executing unit tests, integration tests, and end-to-end tests using Python packages such as pytest.
Familiarity with the Python ecosystem, including PyPI packages and their integration into data engineering workflows.
Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.
Strong communication skills and ability to effectively communicate complex technical concepts to non-technical stakeholders.
Experience working with very large datasets.
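
As a rough sketch of the Airflow experience listed above, a minimal TaskFlow-style DAG, assuming Airflow 2.4 or later. The DAG id, schedule, and task bodies are placeholders rather than anything specific to this role.

    from datetime import datetime

    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
    def daily_ingest():
        @task
        def extract():
            # Placeholder: pull a batch of records from a source system.
            return [{"order_id": "a", "amount": 5.0}]

        @task
        def load(records):
            # Placeholder: write the batch to the warehouse.
            print(f"loaded {len(records)} records")

        # TaskFlow wires the extract-to-load dependency by passing the return value.
        load(extract())

    daily_ingest()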

Preferred

Experience working with reporting tools such as Tableau.
Prior experience working on machine learning projects.
Prior experience working in finance.

Benefits

Medical care
Insurance
Savings plans
Flexible Work Programs
Development programs
Educational support
Paid volunteer days
Matching gift programs
Employee networks

Company

myGwork - LGBTQ+ Business Community

myGwork is the largest global platform for the LGBTQ+ business community.

Funding

Current Stage: Early Stage
Total Funding: $4.77M
Key Investors: 24 Haymarket, Innovate UK

Funding Rounds:
2023-08-17 · Series Unknown · $1.66M
2023-08-17 · Grant · Undisclosed
2021-12-07 · Series A · $2.12M

Leadership Team

Adrien Gaubert
Co-Founder & CMO
Company data provided by Crunchbase