Cystems Logic · 5 hours ago
Site Reliability Engineer (SRE) - San Jose, CA (Hybrid)
Cystems Logic is seeking a Site Reliability Engineer (SRE) to support a government agency by ensuring the reliability, scalability, and performance of critical systems across on-prem and cloud environments. The role requires strong expertise in Linux, Kubernetes, AWS, and automation, with responsibilities including infrastructure automation, incident management, and CI/CD enablement.
Information Technology & Services
Responsibilities
Extensive experience working with Linux distributions such as RHEL and CentOS, including OS shells, filesystems, and utilities
Knowledge of distributed computing and experience with container orchestration frameworks, including on-prem and Rancher Kubernetes, plus good knowledge of Kubernetes objects
Experience with ONTAP storage is preferable: volumes, aggregates, backups, and DR planning
Creating and supporting automation scripts (shell, Ansible, Python) for infrastructure deployments, validations, and monitoring to improve operational tasks
Experience scheduling monitoring scripts using cron and Airflow
Experience with monitoring tools such as Dynatrace, Apica, and Grafana
Database knowledge, including SQL and NoSQL databases
Experience building CI/CD pipelines (preferred)
Cloud platform knowledge (specifically AWS) is required
Incident handling and problem management
Experience with AWS ECS and EKS is an added advantage
Experience with Dremio is an added advantage
Experience with Dynatrace or any other tracing infrastructure or real-time monitoring tool is an added advantage
Experience with Spark and maintaining Spark clusters
Qualification
Required
Extensive experience working with Linux distributions such as RHEL and CentOS, including OS shells, filesystems, and utilities
Knowledge of distributed computing and experience with container orchestration frameworks, including on-prem and Rancher Kubernetes, plus good knowledge of Kubernetes objects
Creating and supporting automation scripts (shell, Ansible, Python) for infrastructure deployments, validations, and monitoring to improve operational tasks
Experience scheduling monitoring scripts using cron and Airflow
Experience with monitoring tools such as Dynatrace, Apica, and Grafana
Database knowledge, including SQL and NoSQL databases
Cloud platform knowledge (specifically AWS) is required
Incident handling and problem management
Preferred
Experience building CI/CD pipelines
Experience with AWS ECS and EKS is an added advantage
Experience with Dremio is an added advantage
Experience with Dynatrace or any other tracing infrastructure or real-time monitoring tool is an added advantage
Experience with Spark and maintaining Spark clusters
Company
Cystems Logic
Cystems Logic empowers businesses to achieve greater efficiency and drive growth through our comprehensive suite of 360° value-driven technology solutions.
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase