F5 · 13 hours ago
Sr. Software Engineer, Cloud SRE and Automation
F5 is dedicated to creating a better digital world, empowering organizations to secure and run applications effectively. The Senior Software Engineer specializing in Cloud SRE and Automation will build reliable cloud infrastructure, implement automation, and drive operational excellence.
Consumer Electronics · SaaS · Security · Virtualization
Responsibilities
Design, develop, and implement cloud-native automation and remediation services across AWS, Azure, and GCP platforms
Build and maintain highly available, scalable infrastructure using Infrastructure as Code (Terraform, CloudFormation, ARM Templates)
Develop and optimize cloud architectures for reliability, performance, and cost efficiency
Implement and manage Kubernetes-based containerized workloads across multi-cloud environments
Design and build self-healing systems with automated remediation and closed-loop automation
Create and maintain observability pipelines, monitoring solutions, and alerting systems across cloud platforms
Develop cloud-native CI/CD pipelines for rapid, reliable application and infrastructure deployments
Apply Site Reliability Engineering principles including SLIs, SLOs, SLAs, and error budgets to cloud services
Design and implement runbook automation frameworks for incident response and operational tasks
Build automation tools and scripts to reduce toil and improve operational efficiency
Develop integration layers with ITSM platforms, incident management systems, and monitoring tools (ServiceNow, PagerDuty, Jira)
Implement chaos engineering and resilience testing to validate system reliability
Perform capacity planning, performance tuning, and cost optimization for cloud resources
Participate in incident response and on-call rotations, and conduct blameless postmortems
Implement comprehensive observability solutions using Prometheus, Grafana, OpenTelemetry, CloudWatch, and other tools
Build automated alerting and intelligent runbook triggering based on system metrics and logs
Develop dashboards and metrics to track system health, performance, and reliability
Analyze system behavior and implement predictive analytics for proactive issue detection
Optimize application and infrastructure performance across distributed cloud environments
Work closely with SREs, QA, development teams, and platform engineers to improve reliability and performance
Mentor junior engineers on SRE best practices, cloud architecture, and automation development
Participate in code reviews, technical design discussions, and architecture planning
Contribute to the evolution of SRE culture and practices within the organization
Document automation workflows, runbooks, cloud architectures, and operational procedures
Qualifications
Required
6-8 years of software development experience with 3+ years in SRE, DevOps, cloud engineering, or platform engineering roles
Strong programming proficiency in Python
Hands-on experience with multi-cloud environments: AWS, Azure, GCP
Experience with at least two major cloud providers
Strong experience with Kubernetes architecture, deployments, services, and operations
Strong experience with container orchestration (EKS, AKS, GKE)
Strong experience with Docker, containerd, and container image management
Strong experience with Helm and Kustomize for application packaging
Deep understanding of Site Reliability Engineering methodologies (SLIs, SLOs, SLAs, error budgets)
Hands-on experience with Prometheus, Grafana, OpenTelemetry
Strong experience with CI/CD pipelines (Jenkins, GitLab CI/CD, GitHub Actions, ArgoCD)
Excellent troubleshooting, debugging, and analytical skills; strong written and verbal communication abilities
Typically requires a minimum of 10 years of related experience with a bachelor's degree, or 3 years with a master's degree
Bachelor's degree in Computer Science, Information Technology, or related field preferred
Preferred
Experience with multi-cloud architecture and hybrid cloud deployments
Cloud migration strategies and implementation
Serverless architectures and event-driven systems
Cloud cost optimization and FinOps practices
Cloud security best practices and compliance frameworks
Experience with Kubernetes operators and custom controllers
Custom Resource Definitions (CRDs)
Kubernetes security, RBAC, and network policies
Multi-cluster and multi-tenant Kubernetes architectures
Understanding of, or experience with, machine learning concepts for IT operations
Anomaly detection and predictive analytics
Intelligent alerting and automated root cause analysis
Log analysis using ML techniques
Experience with eBPF for advanced system observability
Custom OpenTelemetry collectors and instrumentation
Advanced APM tools (New Relic, Dynatrace, AppDynamics)
Network performance monitoring (ThousandEyes, Kentik)
Experience with ChatOps frameworks and integrations
Incident response automation platforms (Shoreline, Resolve)
Event streaming platforms (Kafka, Kinesis, Pub/Sub)
Runbook automation platforms (Rundeck, StackStorm, Ansible Tower/AWX)
Workflow orchestration and job scheduling
Automated remediation and self-healing systems
Event-driven automation frameworks
Contributions to open-source cloud, SRE, or automation projects
Experience with database reliability (MySQL, PostgreSQL, NoSQL)
Knowledge of disaster recovery and high availability patterns
Benefits
Incentive compensation
Bonus
Restricted stock units
Benefits
Company
F5
F5 is a multi-cloud application services and security company that specializes in application security, performance, and delivery.
H1B Sponsorship
F5 has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (155)
2024 (110)
2023 (211)
2022 (194)
Funding
Current Stage
Public Company
Total Funding
unknown
Key Investors
Elliott Management Corp.
2020-11-08 Post-IPO Equity
1999-06-04 IPO
1998-09-24 Series Unknown
Company data provided by Crunchbase