State Street · 2 days ago
Data Engineer
Banking · Finance
Responsibilities
Design, develop, and maintain scalable data pipelines using PySpark on Databricks, adhering to best practices and emphasizing software engineering principles.
Implement and optimize stream processing workflows using Kafka for real-time data ingestion and processing.
Utilize Parquet and Avro-formatted data files for efficient storage and retrieval, ensuring data schema compatibility and evolution.
Leverage the Databricks platform on AWS to build and manage data processing workflows and analytics, while adhering to development lifecycle standards.
Harness the power of Databricks Delta Lake and Parquet files for data warehousing, query optimization, and data versioning.
Collaborate closely with data analysts and scientists to understand their requirements and provide reliable and timely data solutions.
Implement robust testing methodologies, including unit testing, integration testing, and end-to-end testing, utilizing Python packages such as pytest.
Contribute to the PySpark/Python ecosystem by creating reusable components, maintaining internal PyPI packages, and exploring other common Python packages.
Monitor data pipelines, identify and resolve issues, and ensure data integrity and quality.
Stay up-to-date with the latest trends and technologies in data engineering, software development, and testing practices, and actively share knowledge with the team.
Qualifications
Required
Bachelor’s or master’s degree in computer science or a related field.
Minimum 5 years of real-world Data Engineering experience working on large-scale data projects.
Strong proficiency in PySpark, Python, and shell scripting, with a focus on software engineering best practices and a deep understanding of the development lifecycle.
Experience working with workflow management tools such as Airflow.
Experience with stream processing technologies, preferably Kafka.
Familiarity with the Avro data serialization format and its usage in data engineering workflows.
Expertise in using the Databricks platform on AWS for data processing and analytics.
Solid understanding of data warehousing concepts and experience with Delta Lake and Parquet files.
Proficiency in SQL and experience with relational databases.
Strong testing skills, with experience in implementing and executing unit tests, integration tests, and end-to-end tests using Python packages such as pytest.
Familiarity with the Python ecosystem, including PyPI packages and their integration into data engineering workflows.
Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.
Strong communication skills and ability to effectively communicate complex technical concepts to non-technical stakeholders.
Working experience with Databricks and PySpark.
Proficiency in writing complex SQL queries.
Working experience with cloud platforms like AWS or Azure (preferably AWS).
Working experience with Airflow.
Experience working with very large datasets.
Preferred
Experience working with reporting tools such as Tableau.
Past experience working on Machine Learning projects.
Past experience working in finance.
Benefits
Flexible Work Programs
Comprehensive Medical Care
Insurance Plans
Savings Plans
Development Programs
Educational Support
Company
State Street
At State Street, we partner with institutional investors all over the world to provide comprehensive financial services, including investment management, investment research and trading, and investment servicing.
H1B Sponsorship
State Street has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Charts (not reproduced): Distribution of Different Job Fields Receiving Sponsorship; Trends of Total Sponsorships, 2023 (1)
Funding
Current Stage: Public Company
Total Funding: $6B
2024-03-18 · Post-IPO Debt · $1B
2023-11-21 · Post-IPO Debt · $1.5B
2023-08-03 · Post-IPO Debt · $1.5B
Company data provided by Crunchbase