Principal DevOps Engineer - Hybrid jobs in United States

hackajob · 6 hours ago

Principal DevOps Engineer - Hybrid

hackajob is collaborating with BMC Software to connect them with exceptional tech professionals for this role. We are looking for a Principal DevOps Engineer to help build and operate our next-generation Agentic-AI Data Management platform from 0 to 1, focusing on reliability and automation in production systems.

Artificial Intelligence (AI) · Generative AI · Human Resources · Recruiting · Software

Responsibilities

Design, build, and operate the core cloud and Kubernetes-based platform that underpins a 0-1 data automation and management product, taking infrastructure and operational capabilities from concept through production
Write production-grade automation in Python, Go, or similar languages to eliminate manual work across provisioning, deployment, scaling, monitoring, and incident response
Design and evolve Kubernetes-based platforms using Docker, Helm, and cloud-native services, balancing speed of delivery with long-term operability and cost control
Establish and enforce SRE best practices including SLIs/SLOs, alerting strategies, error budgets, incident management, and post-incident reviews to ensure enterprise-grade reliability
Build and maintain robust CI/CD pipelines (e.g., GitHub Actions, Jenkins) to support frequent, safe, and repeatable deployments across multiple environments
Manage cloud environments in accordance with company security guidelines, embedding security, compliance, and access controls directly into infrastructure and pipelines
Build and maintain internal tools, services, and automation that support deployment, observability, debugging, and operational excellence while reducing human error
Support deployments across AWS, including integrations with enterprise systems and geographically redundant, highly available services
Work closely with product engineering teams to design operable systems, influence architectural decisions, and ensure production realities inform development choices early
Act with strong ownership: identify operational gaps, propose pragmatic solutions, and move work forward without waiting for perfect requirements or ideal conditions

Qualifications

Kubernetes · Cloud Architecture · CI/CD Pipelines · Automation Python · Automation Go · SRE Practices · Security Compliance · Operational Tooling · Product Collaboration · Founder-Level Ownership

Required

Experience in building and operating cloud and Kubernetes-based platforms
Hands-on experience with automation in Python, Go, or similar languages
Knowledge of Docker, Helm, and cloud-native services
Experience with SRE best practices including SLIs/SLOs, alerting strategies, error budgets, incident management, and post-incident reviews
Experience in building and maintaining CI/CD pipelines (e.g., GitHub Actions, Jenkins)
Ability to manage cloud environments in accordance with security guidelines
Experience in building and maintaining internal tools, services, and automation for operational excellence
Experience with AWS deployments and integrations with enterprise systems
Strong collaboration skills with product engineering teams
Demonstrated ownership and ability to identify operational gaps and propose solutions

Company

hackajob

The AI-native tech hiring platform trusted by enterprises, scale-ups, and 1M+ tech professionals worldwide.

Funding

Current Stage: Growth Stage
Total Funding: $33M
Key Investors: Volition Capital, Downing Ventures, Techstars
2023-05-03 · Series B · $25M
2018-10-25 · Series A · $6.7M
2017-03-31 · Seed · $0.58M

Leadership Team

Mark Chaffey · CEO
Phil Kell · VP - Marketplace
Company data provided by Crunchbase