Programming.com · 18 hours ago
Site Reliability Engineer
Programming.com is seeking a Senior Site Reliability Engineer (SRE) with expertise in AWS and Kubernetes. The role involves designing and operating highly available systems for banking and payments, leading SRE practices, and supporting Java microservices on Kubernetes.
Responsibilities
Design and operate highly available, fault-tolerant systems for banking, payments, and trading platforms
Lead SRE practices: SLIs, SLOs, error budgets, RCA, post-incident remediation, L3/L4 on-call support
Support Java microservices on Kubernetes (EKS); optimize performance, scalability, and latency
Strong AWS experience: EC2, EKS, IAM, VPC, RDS, DynamoDB, S3, CloudWatch
Infrastructure automation using Terraform; scripting with Python, Go, Bash
Kubernetes networking, storage, and service mesh: Istio, Anthos Service Mesh, Portworx, multi-cluster/federation
CI/CD with GitLab CI/CD, Jenkins; zero-downtime deployments and DR strategies
Observability using Prometheus, Datadog, Splunk, and Kiali; eBPF for deep system visibility
Real-time streaming: Kafka, ksqlDB, Kafka Streams, Spark Streaming
Security & compliance: IAM, secrets management, SOC 2, PCI DSS, SOX, banking-grade controls
Strong Linux/Unix, Docker, and VMware skills; networking tools (Nginx Controller, Seesaw)
Experience with high-frequency transaction systems and regulated environments
Qualifications
Required
AWS Solutions Architect – Professional or AWS DevOps Engineer – Professional
CKA or CKS