
MokshaaLLC · 17 hours ago

Senior Data Platform Engineer - Apache Druid - Sunnyvale, CA - ONSITE

MokshaaLLC is seeking a Senior Data Platform Engineer specializing in Apache Druid. The role involves managing cluster setup, configuration, and production operations while ensuring optimal performance and availability for data ingestion and querying.
Consulting, Software, Human Resources, Information Technology, Recruiting, Staffing Agency
No H1B
Hiring Manager
Dhivya P

Responsibilities

Core Technical Skills
Apache Druid
- Cluster setup, configuration, and production operations
- Real-time and batch ingestion (Kafka, streaming tasks, indexing services)
- Segment management, compaction, retention, and query optimization
- Troubleshooting performance and availability issues
Trino
- Cluster deployment and tuning for large-scale distributed queries
- Connector configuration (Hive, Iceberg, Delta Lake, JDBC, etc.)
- Query optimization, memory management, and workload isolation
- Security configuration (authentication, authorization, access control)
Python
- Strong proficiency in Python for automation and backend services
- Writing clean, maintainable, production-grade code
- Building tooling for deployment, monitoring, and operational workflows
- Experience with REST APIs, scripting, and data processing libraries
DevOps & Platform Engineering
Containerization & Orchestration
- Docker image creation and optimization
- Kubernetes deployment, scaling, and troubleshooting
- Helm charts and Kubernetes operators (preferred)
Infrastructure & CI/CD
- Infrastructure as Code using Terraform, CloudFormation, or similar
- CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, ArgoCD, etc.)
- Blue-green and rolling deployment strategies
Cloud Platforms
- Hands-on experience with AWS, GCP, or Azure
- Networking, storage, and compute optimization for data workloads
- Cost monitoring and optimization
Observability & Operations
- Monitoring and alerting using Prometheus, Grafana, ELK, OpenTelemetry, or similar
- Log aggregation, metrics, and distributed tracing
- Incident management, root cause analysis, and postmortems
- Capacity planning and performance benchmarking

Qualifications

Apache Druid, Trino, Python, Containerization & Orchestration, Infrastructure & CI/CD, Cloud Platforms, Observability & Operations

Required

Apache Druid: Cluster setup, configuration, and production operations
Apache Druid: Real-time and batch ingestion (Kafka, streaming tasks, indexing services)
Apache Druid: Segment management, compaction, retention, and query optimization
Apache Druid: Troubleshooting performance and availability issues
Trino: Cluster deployment and tuning for large-scale distributed queries
Trino: Connector configuration (Hive, Iceberg, Delta Lake, JDBC, etc.)
Trino: Query optimization, memory management, and workload isolation
Trino: Security configuration (authentication, authorization, access control)
Python: Strong proficiency in Python for automation and backend services
Python: Writing clean, maintainable, production-grade code
Python: Building tooling for deployment, monitoring, and operational workflows
Python: Experience with REST APIs, scripting, and data processing libraries
Containerization & Orchestration: Docker image creation and optimization
Containerization & Orchestration: Kubernetes deployment, scaling, and troubleshooting
Infrastructure & CI/CD: Infrastructure as Code using Terraform, CloudFormation, or similar
Infrastructure & CI/CD: CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, ArgoCD, etc.)
Infrastructure & CI/CD: Blue-green and rolling deployment strategies
Cloud Platforms: Hands-on experience with AWS, GCP, or Azure
Cloud Platforms: Networking, storage, and compute optimization for data workloads
Cloud Platforms: Cost monitoring and optimization
Observability & Operations: Monitoring and alerting using Prometheus, Grafana, ELK, OpenTelemetry, or similar
Observability & Operations: Log aggregation, metrics, and distributed tracing
Observability & Operations: Incident management, root cause analysis, and postmortems
Observability & Operations: Capacity planning and performance benchmarking

Preferred

Containerization & Orchestration: Helm charts and Kubernetes operators

Company

MokshaaLLC

Information Technology Firm

Funding

Current Stage
Growth Stage
Company data provided by Crunchbase