CACI International Inc · 5 months ago

Cloud Reliability Engineer

CACI International Inc is seeking a Cloud Reliability Engineer to drive the design, build, and support of cloud-native infrastructure and platform services for critical Department of Defense mission systems. This role focuses on ensuring product reliability and enhancing user outcomes through collaboration with various teams and the implementation of service level objectives.

Information Technology · Service Industry · Software
Comp. & Benefits · No H1B · Security Clearance Required

Responsibilities

Engineer Cloud-Native Platforms: Design, deploy, and maintain robust Kubernetes clusters and supporting services across AWS GovCloud and Azure
Drive User-Centric Reliability: Collaborate to understand Critical User Journeys (CUJs). Define and implement product-level Service Level Objectives (SLOs), focusing on user-visible behaviors and outcomes (availability, latency, etc.)
Automate Everything: Provision, configure, and monitor platforms using Infrastructure-as-Code (Terraform, CloudFormation)
Enhance Observability: Implement and leverage telemetry and request-level annotation to directly link infrastructure requests to product functionality and mission partner objectives
Secure & Comply: Manage identity, access, patching, logging, and backups in multi-tenant environments, integrating RMF, Zero Trust, and IL5+ hardening into platform design
Troubleshoot with Impact: Prioritize and resolve platform service and infrastructure issues based on user impact and product criticality
Collaborate & Document: Work within Agile teams, contribute to user objective refinement, and maintain comprehensive system documentation

Qualifications

AWS GovCloud · Kubernetes · Infrastructure-as-Code · Cloud Security · Bash · Python · CI/CD tools · Git/GitLab · User-focused SLOs · Troubleshooting skills · Agile teams · Certifications · Communication · Documentation

Required

Active TS/SCI Clearance
Bachelor's degree in a technical field with 3+ years of relevant experience
Deep expertise with AWS (GovCloud/SC2S), Kubernetes (EKS or self-managed), Linux, and CI/CD tools
Proficiency in Bash or Python
Hands-on experience with Git/GitLab, container registries, and infrastructure monitoring
Strong understanding of cloud security, IAM, networking, and platform lifecycle management
Proven ability to translate user needs into measurable reliability targets and implement user-focused SLOs
Excellent communication and troubleshooting skills, with a focus on end-user experience and product reliability
Solid grasp of cloud networking, load balancing, and DNS
Certifications: CompTIA Cloud+ or Security+; GICSP, SSCP, or GSEC

Preferred

Master's degree in a technical discipline
Experience with Air Force or DoD platform infrastructure environments (e.g., Platform One, Iron Bank, Big Bang)
Familiarity with Atlassian tools and DevSecOps workflows

Benefits

Healthcare
Wellness
Financial
Retirement
Family support
Continuing education
Time off benefits

Company

CACI International Inc

At CACI International Inc (NYSE: CACI), our 25,000 talented and dynamic employees are ever vigilant in delivering distinctive expertise and technology to meet our customers’ greatest challenges in national security.

Funding

Current Stage: Public Company
Total Funding: $1B
2025-05-21 · Post-IPO Debt · $1B
2003-01-10 · IPO

Leadership Team

John Mengucci
President & CEO
Darryl W Burke
Senior Vice President / Air Force Client Executive
Company data provided by Crunchbase