Data Architect (AWS + Databricks) @ Tredence Inc. | Jobright.ai
98 applicants

Tredence Inc. · 6 days ago

Data Architect (AWS + Databricks)

Analytics · Artificial Intelligence (AI)
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Implement scalable and sustainable data engineering solutions using tools such as Databricks, Azure, Apache Spark, and Python. The data pipelines must be created, maintained, and optimized as workloads move from development to production for specific use cases.
Architecture design experience for cloud and non-cloud platforms
Expertise with various ETL technologies and familiarity with ETL tools
Scrum management experience with medium and large-scale projects
Experience whiteboarding enterprise-level architectures
Extensive knowledge of cloud and on-premises tools and architectures
Experience implementing large-scale hybrid cloud platforms and applications
Knowledge of one or more scripting languages
Experience with CI/CD
Ability to set and lead the technical vision while balancing business drivers
Understanding of AWS EMR (Elastic MapReduce).
Execute change tasks, including but not limited to partitioning tables according to a partitioning strategy
Create and optimize star schema models in a Dedicated SQL Pool
Experience on data migration projects from AWS EMR to Databricks, including security provisioning, schema creation, DB role creation, DDL execution, and data restoration.
Review workload and provide recommendations for performance optimization and operational efficiency
Proactively adjust capacity based on current utilization and projected upcoming usage
Document and advise developers and end-users about best practices and housekeeping
Work on incidents and onboarding issues related to moving from on-premises datastores to the cloud.
Identify performance bottlenecks and monitor Azure Synapse Analytics using Dynamic Management Views
Build performant tables using table distribution and indexing
Design, develop, and optimize ETL/ELT data pipelines using Databricks on AWS.
Collaborate with data scientists, analysts, and stakeholders to understand business requirements and provide technical solutions.
Implement big data processing solutions using Spark on Databricks.
Manage and maintain Databricks clusters and optimize resource utilization to improve performance.
Develop and maintain CI/CD pipelines for data ingestion and transformation processes.
Integrate Databricks with other AWS services such as S3, Redshift, Glue, Athena, and Lambda.
Implement and manage data lakes using AWS S3 and Delta Lake architecture.
Ensure data quality, governance, and security by implementing best practices.
Monitor, troubleshoot, and optimize Databricks jobs and clusters for performance and cost efficiency.
Automate workflows and support real-time data streaming processes using Kafka, Kinesis, or AWS Glue.
Work with DevOps teams to manage infrastructure as code using tools like Terraform or AWS CloudFormation.

Qualifications

Cloud Platforms, Databricks, Apache Spark, Python, ETL technologies, AWS EMR, CI/CD, Data Migration, Data Lakes, Infrastructure as Code, Scrum Management, SQL, Scala, Data Governance, Data Security

Required

Cloud Platforms (broad knowledge of basic utilities and in-depth knowledge of more than one of the cloud platforms)
Strong experience with Databricks on AWS (2+ years).
Proficiency in Apache Spark and distributed data processing.
Hands-on experience with AWS services such as S3, Redshift, EC2, Lambda, Glue, and EMR.
Expertise in Python, SQL, and Scala for data processing.
Experience with Delta Lake and building data lakes.
Familiarity with CI/CD pipelines using tools such as Jenkins, Git, or CodePipeline.
Experience with infrastructure as code (IaC) tools like Terraform or AWS CloudFormation.
Knowledge of data governance, data security, and compliance frameworks.
Strong analytical and problem-solving skills, with the ability to optimize complex data workflows.
Excellent communication skills and the ability to work in a fast-paced, collaborative environment.

Company

Tredence Inc.

Tredence is a global data science solutions provider focused on solving the last mile problem in AI.

H1B Sponsorship

Tredence Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship, with the job field similar to this job highlighted]
Trends of Total Sponsorships
2023 (63)
2022 (91)
2021 (73)
2020 (60)

Funding

Current Stage: Late Stage
Total Funding: $205M
Key Investors: Advent International, Chicago Pacific Founders
2022-12-22 · Series B · $175M
2020-12-10 · Series A · $30M

Leadership Team

Shub Bhowmick, Chief Executive Officer & Co-Founder
Shashank Dubey, Chief Revenue Officer and Co-founder
Company data provided by Crunchbase