VARITE INC · 2 months ago

Senior Site Reliability Engineer (SRE)

Phoenix, AZ

Full-time

Onsite

Senior Level, Lead/Staff

$105K/yr - $125K/yr

14+ years exp

VARITE is looking for a qualified Senior Site Reliability Engineer (SRE) for one of its clients located in Phoenix, AZ. The role involves providing senior-level SRE support, ensuring system reliability, and developing automation scripts primarily using Java, while managing cloud infrastructure on Azure and deploying workloads on Kubernetes.

Information Technology & Services

Growth Opportunities

No H1B

U.S. Citizen Only

Responsibilities

Provide senior-level SRE support, ensuring system reliability, availability, and operational excellence across all environments

Develop and maintain services and automation scripts using Java as the primary programming language

Build, deploy, and optimize workloads running on Kubernetes clusters (including multi-cluster and federated deployments)

Manage and enhance cloud infrastructure leveraging Azure services and best practices

Work with Linux/Unix systems and develop automation using BASH shell scripting

Build automation and tooling using Python or Go

Design, implement, and maintain CI/CD pipelines using GitLab CI/CD and Jenkins

Support application streaming, event processing, and analytics using Kafka Stream Generator, ksqlDB, and Spark Streams

Work with service mesh technologies including Istio and understand Anthos Service Mesh

Utilize VMware and other virtualization platforms for environment provisioning

Provide robust incident support, root-cause analysis, and production issue resolution

Implement eBPF-based observability and performance troubleshooting where applicable

Develop and enhance monitoring and alerting systems using Splunk, Prometheus, Datadog, and Kiali

Configure and manage load balancing with Nginx Controller and Seesaw

Use Terraform for infrastructure-as-code and Docker for containerization

Manage Kubernetes storage using Portworx

Automate repetitive operational tasks and contribute to platform stability and efficiency

Provide support across all US time zones, including rotational shifts, weekends, and occasional 24/7 escalations

Qualifications

Java, Kubernetes, Azure, Incident response, Python, BASH scripting, Docker, Terraform, Monitoring tools, Go, VMware, Kafka, eBPF, Load balancing, Functional languages

Required

14+ years of experience required

Only U.S. citizens (USC) and Green Card (GC) holders, due to the nature of the project

Extensive experience in incident response, troubleshooting, performance engineering, and service reliability

Ability to automate manual operational tasks

Strong understanding of monitoring, alerting, and observability practices

Java (Proficient) – Must be hands-on in building, supporting, and optimizing Java-based systems and microservices

Kubernetes (Hands-on) – Deployment, autoscaling, federation, ingress, storage, service mesh, and cluster operations

Azure (Highly Proficient) – Strong experience across Azure compute, networking, storage, DevOps, and security features

Knowledge of Linux/Unix internals and BASH scripting

Strong experience with Python or Go

VMware and virtualization technologies

Kafka ecosystem tools: Kafka Stream Generator, ksqlDB, Spark Streams

Experience with Istio/Anthos Service Mesh

Familiarity with eBPF for low-level observability

Monitoring tools: Splunk, Prometheus, Datadog, Kiali

Load balancing with Nginx Controller and Seesaw

Docker and Terraform expertise

Experience working with Portworx for Kubernetes storage

Preferred

Proficiency in functional or logic programming languages: Prolog, Haskell, OCaml

Company

VARITE INC

VARITE has a definite spirit.

Founded in 2000

San Jose, California, US

1001-5000 employees

http://www.varite.com

Funding

Current Stage

Late Stage

Leadership Team

Adarsh Katyal

President & CEO

Sue Patel Arora

Vice President Of Strategic Partnerships

Company data provided by Crunchbase