Upbound · 2 months ago
Senior Production Engineer (REMOTE)
Upbound is the company behind Crossplane, leading the shift toward agentic infrastructure. They are hiring a Senior Production Engineer to enhance the reliability and availability of Upbound Cloud, collaborating with engineering and product teams to ensure a performant and scalable platform.
Cloud Computing · Information Services · Information Technology · Software
Responsibilities
Contribute to the production engineering strategy for Upbound Cloud, ensuring high availability, scalability, and efficiency of all customer-facing systems. This includes internalizing the product strategy and building the system resiliency needed to support product growth
Own reliability metrics — including uptime, latency, and error budgets — and champion service-level objectives (SLOs) across teams
Design and implement automation for provisioning, observability, and incident response to minimize human intervention and increase operational maturity
Collaborate with development teams to build reliability into the software lifecycle through proactive architectural reviews, chaos testing, and performance profiling
Operate and improve multi-tenant Kubernetes-based systems, leveraging Crossplane and other cloud-native tooling
Drive incident management — leading blameless postmortems, root cause analyses, and systemic remediation efforts
Mentor engineers in production engineering practices, fostering a culture of ownership, reliability, and continuous improvement
Contribute to the evolution of our cloud platform through design input, tool selection, and scalable systems thinking
Qualifications
Required
5+ years of experience in software, infrastructure, or site reliability engineering roles
Strong background in distributed systems, service-oriented architectures, and cloud-native technologies
Proficiency in Kubernetes, Go, and Infrastructure-as-Code strategies
Expertise in observability and monitoring, preferably with Honeycomb and OpenTelemetry
Experience managing large-scale SaaS systems in production with multi-region and high-availability requirements
Strong understanding of incident response, capacity planning, and change management
Excellent communication skills and ability to collaborate across functions
Preferred
Experience with Crossplane, multi-cloud infrastructure, or control-plane architectures
Prior leadership experience driving reliability initiatives at scale
Company
Upbound
Upbound is an infrastructure management platform that runs, scales, and optimizes services across multiple cloud environments.
Funding
Current Stage: Growth Stage
Total Funding: $69M
Key Investors: Altimeter Capital, Google Ventures
2021-11-29 · Series B · $60M
2018-05-02 · Series A · $9M
Company data provided by Crunchbase