Director of Engineering, Production Infrastructure jobs in United States

Klaviyo · 1 week ago

Director of Engineering, Production Infrastructure

Klaviyo is a company that empowers creators to own their growth through innovative technology and data solutions. The Director of Engineering, Production Infrastructure will lead the development of a product-quality platform that enhances developer velocity, system reliability, and overall business impact by defining platform primitives and ensuring operational excellence.

Advertising · Analytics · E-Commerce · Marketing Automation · Software
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Own the Production Infrastructure charter. Define the platform primitives Klaviyo provides (compute runtimes, data storage options, messaging/eventing, service networking, observability) and the clear “contract” for each: APIs, SLIs/SLOs, support model, and runbooks, ensuring consistency with our company-wide operational excellence best practices
Publish golden paths and decision trees that make default choices obvious (e.g., “run X here,” “store a bit here,” “expose data to frontend via Y”), minimizing one‑off work and increasing self‑service
Raise reliability and safety bars across production: incident prevention and response (blameless postmortems, on‑call health), change management, capacity planning, and resilient multi‑tenant patterns
Accelerate developer velocity by improving time‑to‑first‑service, deployment lead time, and mean time to recovery; partner with product teams to remove infrastructure bottlenecks and reduce cognitive load
Engineer for cost‑effectiveness at scale. Establish clear cost guardrails, usage quotas, and right‑sizing policies; partner with Finance and Security to balance spend, risk, and speed
Lead and grow high‑performing teams of managers and senior ICs; set crisp goals, coach for impact, and cultivate an inclusive, ownership‑driven culture
Partner cross‑functionally with engineering leaders, security, and others to sequence investments, clarify ownership boundaries, and land platform changes safely
Measure what matters. Define and report a concise scorecard (e.g., SLO coverage, incident frequency/severity, lead time for changes, MTTR, developer NPS for platform, infra cost‑to‑serve)
Transform workflows by putting AI at the center, building smarter systems and ways of working from the ground up; continuously experiment with AI tools and share learnings to keep the org ahead of the curve

Qualifications

Production platforms · SRE practices · Infrastructure management · Public cloud (AWS/GCP) · Container orchestration · Data storage options · Incident management · People leadership · AI experimentation · System thinking

Required

Platform‑minded, execution‑oriented leader with a track record of building and operating production platforms at scale (e.g., multi‑tenant compute, storage, networking, CI/CD, observability). You prioritize measurable outcomes such as reliability, efficiency, and developer productivity
Experienced people leader: 10+ years in infrastructure/SRE/platform engineering, including 5+ years managing managers and senior ICs; you set high bars, coach well, and build inclusive teams
Reliability first. Deep familiarity with SRE practices, SLO/SLI design, incident management, capacity planning, and operational readiness
Great system thinker & communicator. You reduce ambiguity, create clarity in docs and diagrams, and influence across product, data, and security to land org‑wide changes
Outcome‑driven and accountable. You set crisp goals, instrument the work, and hold teams to impact, not just activity. You're comfortable saying “no” and narrowing scope to ship
AI‑curious and hands‑on. You've already experimented with AI in work or personal projects and are eager to learn fast, using AI responsibly to make your team's work smarter and more efficient
Technical stack familiarity (mix of): public cloud (AWS/GCP), container orchestration, service meshes/ingress, data stores (SQL/NoSQL/object), eventing/streaming, IaC, and modern observability

Preferred

Experience productizing internal platforms (treating infra as a product with SLAs, roadmaps, and developer experience metrics)
Background in data or event‑driven architectures at scale; prior partnership with a centralized data platform (e.g., KDP) to define clean ownership boundaries
Prior success improving cost‑to‑serve and reliability in a high‑growth SaaS environment

Benefits

Participation in the company’s annual cash bonus plan
Variable compensation (OTE) for sales and customer success roles
Equity
Sign-on payments
A comprehensive range of health, welfare, and wellbeing benefits based on eligibility

Company

Klaviyo is an automation and email platform designed to help grow businesses.

H1B Sponsorship

Klaviyo has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (47)
2024 (29)
2023 (24)
2022 (27)
2021 (21)
2020 (8)

Funding

Current Stage
Public Company
Total Funding
$1.35B
Key Investors
Shopify · Sands Capital Ventures · Accel
2025-08-13 · Post-IPO Secondary · $195.06M
2025-05-14 · Post-IPO Secondary · $372.95M
2023-09-20 · IPO

Leadership Team

Andrew Bialecki
CEO
Ed Hallen
Co-Founder, Chief Strategy Officer, Board Member
Company data provided by Crunchbase