Lead Data Engineer (Data Scientist) jobs in United States

Straive · 1 day ago

Lead Data Engineer (Data Scientist)

Straive is a global leader in enterprise-grade Data Analytics and AI solutions, committed to empowering businesses across various industries with cutting-edge technology and expert insights. The Lead Data Engineer will architect and lead the implementation of cloud-native, AI-powered data platforms built on Databricks, driving technical strategy and mentoring engineers while partnering closely with data science and product teams.

CRM, Outsourcing

Responsibilities

Architect and lead implementation of Databricks Lakehouse solutions
Build scalable ETL/ELT pipelines using Spark, Delta Lake, and Databricks Workflows
Design and implement AI-powered pipelines using Databricks AI and Agent Bricks
Develop agent-based workflows for data ingestion, validation, enrichment, and decision automation
Integrate ML models, LLMs, and AI agents into data and analytics pipelines
Establish best practices for MLOps, AgentOps, CI/CD, and data quality
Optimize performance, reliability, and cost of Databricks workloads
Lead architecture reviews, code reviews, and technical design sessions
Mentor and guide data engineers on modern AI-driven data engineering patterns

Qualifications

Databricks, Apache Spark, Data Engineering, AI-powered pipelines, SQL, Data modeling, Cloud storage, Data governance, Team building, Technical leadership, Mentoring, Communication skills

Required

8+ years of experience in data engineering, with leadership experience
Strong hands-on expertise with Databricks (AWS/Azure/GCP)
Advanced knowledge of Apache Spark (PySpark / Scala) and Delta Lake
Proven experience with Databricks AI, including MLflow (experimentation, model registry, deployment), Feature Store, and model monitoring and lifecycle management
Hands-on experience with Databricks Agent Bricks, including designing and orchestrating agent-based data workflows; building AI agents that interact with datasets, metadata, and pipelines; and integrating LLM-powered agents into Databricks workflows
Strong SQL skills and data modeling experience
Experience with cloud storage and security (S3/ADLS/GCS, IAM)
Solid understanding of data governance, lineage, and access controls
Strong technical leadership and architectural decision-making skills
Ability to communicate complex AI and data concepts to non-technical stakeholders
Passion for mentoring and building high-performing teams
Comfortable operating in fast-paced, evolving AI ecosystems

Preferred

Experience with Mosaic AI, Vector Search, or GenAI applications on Databricks
Familiarity with streaming platforms (Kafka, Delta Live Tables)
Experience with agent orchestration patterns (tool calling, memory, evaluation)
Exposure to regulated industries (healthcare, genomics, finance)
Databricks and/or cloud certifications

Company

Straive is a global provider of technology-driven content and data services.

Funding

Current Stage
Late Stage
Total Funding
unknown
2021-08-20: Acquired

Leadership Team

Ankor Rai
Chief Executive Officer
Lori Silverstein
SVP Global Sales, Data and Content Solutions
Company data provided by Crunchbase