Randstad Digital Americas · 2 days ago
Site Reliability Engineer
Randstad Digital Americas is seeking a Site Reliability Engineer who possesses a strong desire to learn new technologies and collaborate effectively with teams. The role involves managing and supporting distributed systems, cloud development, and implementing observability practices to ensure platform resiliency.
Information Technology & Services
Qualifications
Required
Bachelor's degree or higher in a technology related field (like Engineering, Computer Science, Information Technology)
A minimum of 5 years of combined experience in production support, development, and SRE. Hands-on experience deploying and/or supporting highly distributed, multi-tiered systems at scale
A minimum of 5 years of experience in cloud development (AWS) and migration; experience building and operating highly resilient platforms in AWS cloud environments
3-5+ years of experience in software development with Python, Node.js, Java, with a focus on SDLC and automation
A self-starter and teammate who can independently manage multiple responsibilities in a dynamic environment
Strong hands-on experience automating with scripting languages such as Python and shell scripting
Solid understanding of Cloud Computing and DevOps concepts including CI/CD Pipelines
3-5+ years of hands-on Kubernetes experience, including support and app deployment
Expert, hands-on experience with one or more observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, Splunk)
Experience with instrumentation, plus systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale
Proven experience in maintaining scalability and resiliency in complex environments
Proven experience in implementing advanced observability practices and techniques at scale
Ability to triage, perform root cause analysis, and be decisive under pressure
Experience managing and interpreting large datasets using query languages and visualization tools
Excellent verbal and written communication skills, and the ability to tailor them to various audiences
Preferred
AWS and AWS/EKS certifications are a plus
Benefits
Medical
Prescription
Dental
Vision
AD&D
Life insurance offerings
Short-term disability
401K plan
Company
Randstad Digital Americas
Randstad Digital is a trusted digital enablement partner that facilitates accelerated transformation for businesses by providing global talent, capacity, and solutions across specialized domains.