Data Engineer jobs in United States

Soho Square Solutions · 5 hours ago

Data Engineer

Soho Square Solutions is seeking a Data Engineer to join their team in New York, NY. The role involves developing and migrating applications to cloud services, building scalable ETL pipelines, and collaborating with stakeholders to meet data requirements.
Enterprise Software · Finance · Big Data · Crypto & Web3 · Blockchain · Financial Services · Project Management
Growth Opportunities
No H1B
Hiring Manager
Virangana Bhalerao

Responsibilities

Migrate applications from on-premises environments to cloud service providers
Develop products and services on the latest technologies by contributing to development, enhancement, testing, and implementation
Develop, modify, and extend code for building cloud infrastructure, and automate it using CI/CD pipelines
Partner with business stakeholders and peers to pursue solutions that achieve business goals through an agile software development methodology
Perform problem analysis, data analysis, reporting, and communication
Work with peers across the system to define and implement best practices and standards
Assess applications and help determine the appropriate application infrastructure patterns
Use best practices and knowledge of internal or external drivers to improve products and services

Qualifications

ETL using Databricks · SQL · AWS Compute · Cloud infrastructure provisioning · Python · CI/CD automation · Container Orchestration · Data pipeline solutions · Problem analysis · Data quality · Collaboration

Required

Hands-on experience in building ETL using Databricks SaaS infrastructure
Experience in developing data pipeline solutions to ingest and exploit new and existing data sources
Expertise in SQL, a programming language such as Python, and ETL tools such as Databricks
Perform code reviews to ensure that requirements are met, execution patterns are optimal, and established standards are followed
Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), AWS Data Integration (Glue)
Advanced understanding of Container Orchestration services including Docker and Kubernetes, and a variety of AWS tools and services
Good understanding of AWS Identity and Access Management, AWS Networking, and AWS Monitoring tools
Proficiency in CI/CD and deployment automation using GitLab pipelines
Proficiency in cloud infrastructure provisioning tools, e.g., Terraform
Proficiency in one or more programming languages, e.g., Python, Scala
Experience with Starburst and Trino, and with building SQL queries in a federated architecture
Good knowledge of Lakehouse architecture
Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala)
Build data ingestion workflows from various sources (structured, semi-structured, and unstructured)
Develop reusable components and frameworks for efficient data processing
Implement best practices for data quality, validation, and governance
Collaborate with data architects, analysts, and business stakeholders to understand data requirements
Tune Spark jobs for performance and scalability in a cloud-based environment
Maintain a robust data lake or Lakehouse architecture
Ensure high availability, security, and integrity of data pipelines and platforms
Support troubleshooting, debugging, and performance optimization in production workloads

Company

Soho Square Solutions

Soho Square Solutions has expertise in financial services, enterprise risk, information security, project management, big data, and blockchain.

Funding

Current Stage
Growth Stage

Leadership Team

Vijay Veerachandran
CEO, Founder and Managing Partner
Company data provided by Crunchbase