Senior Kafka Platform Engineer (Automation & Kubernetes) jobs in United States

Balyasny Asset Management L.P. · 1 day ago

Senior Kafka Platform Engineer (Automation & Kubernetes)

Balyasny Asset Management L.P. is seeking a seasoned Kafka engineer to design, operate, and scale its event streaming platform. The role involves owning the Kafka core, building infrastructure-as-code, and ensuring reliability and performance while partnering with application teams on best practices.

Financial Services
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Architect, deploy, and operate production-grade Kafka clusters (self-managed and/or Confluent/MSK), including upgrades, capacity planning, multi-AZ/region DR, and performance tuning
Run Kafka on Kubernetes using Operators (e.g., Strimzi or Confluent for Kubernetes), Helm, and GitOps; manage StatefulSets, storage, PDBs, affinities, and rolling strategies
Build and maintain automation infrastructure: Terraform/Helm modules, CI/CD pipelines, policy-as-code, and guardrails for repeatable, compliant Kafka provisioning
Implement and manage Kafka Connect, Schema Registry, and MirrorMaker 2/Cluster Linking; standardize connectors (e.g., Debezium) and build self-service patterns
Drive reliability: define SLOs/error budgets, on-call rotations, incident response, postmortems, runbooks, and automated remediation
Implement observability: metrics, logs, traces, lag monitoring, and capacity dashboards (e.g., Prometheus/Grafana, Burrow, Cruise Control, OpenTelemetry)
Secure the platform: TLS/mTLS, SASL (OAuth/SCRAM), RBAC/ACLs, secrets management, network policies, audit, and compliance automation
Guide event-streaming best practices: topic design, partitioning, compaction/retention, idempotency, ordering, schema evolution/compatibility, DLQs, EOS semantics
Partner with app, data, and SRE teams; provide enablement, documentation, and internal tooling for a great developer experience
Lead/mentor engineers and contribute to roadmap, standards, and platform strategy (including ZooKeeper-to-KRaft migrations where applicable)

Qualifications

Kafka, Kubernetes, Infrastructure as Code, Security, Observability, Cloud experience, Automation tools, Programming languages, Communication skills, Mentoring

Required

Deep hands-on experience operating Kafka in production at scale (brokers, controllers, partitions, ISR, tiered storage/retention, rebalancing, replication, recovery)
Strong Kubernetes expertise running stateful systems: storage classes, StatefulSets, node/pod tuning, PodDisruptionBudgets, topology spread, network policies
Automation first: Infrastructure as Code (Terraform), Helm, Operators, GitOps (Argo CD/Flux), and CI/CD (e.g., GitHub Actions/Jenkins) for platform lifecycle
Proficiency with one or more languages for tooling/automation: Python, Go, or Java; plus Bash and solid Linux fundamentals (networking, filesystems, JVM tuning basics)
Observability and reliability engineering for Kafka: Prometheus/Grafana, logging, alerting, lag monitoring, capacity/throughput modeling, performance tuning
Security for data in motion: TLS/mTLS, SASL/OAuth, ACL/RBAC, secrets management (e.g., Vault), and audit/compliance practices
Experience with Kafka ecosystem components: Kafka Connect, Schema Registry, MirrorMaker 2/Cluster Linking; familiarity with Cruise Control
Cloud experience (AWS/Azure/GCP) with networking, IAM, and one or more managed offerings (e.g., Confluent Cloud or AWS MSK)
Proven track record designing runbooks, leading incidents/postmortems, and driving platform roadmaps
Excellent communication and partnership skills with platform and application teams

Preferred

Experience migrating ZK-based clusters to KRaft and/or cross-cluster replication designs
Data processing frameworks (Kafka Streams, Flink, Spark Structured Streaming) and EOS semantics
Policy-as-code (OPA/Gatekeeper), secrets rotation automation, and compliance-as-code
Experience with Strimzi or Confluent for Kubernetes in production
Knowledge of CDC patterns and tools (e.g., Debezium) and database connectors at scale
Multi-region architectures, cluster linking strategies, and disaster recovery drills
Service mesh familiarity (mTLS, ingress/egress controls) and advanced network tuning

Company

Balyasny Asset Management L.P.

Balyasny Asset Management (BAM) is a diversified global investment firm founded in 2001 by Dmitry Balyasny, Scott Schroeder, and Taylor O'Malley.

H1B Sponsorship

Balyasny Asset Management L.P. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (107)
2024 (85)
2023 (39)
2022 (44)
2021 (28)
2020 (18)

Funding

Current Stage
Late Stage

Leadership Team

David Black
Managing Director, Business Development
Jeremy Brunelli
Data Science & AI | Managing Director, Head of Geospatial & Alt Data Research
Company data provided by Crunchbase