This job has closed.

Randstad Digital Americas · 17 hours ago

Site Reliability Engineer

Randstad Digital Americas is seeking a Site Reliability Engineer to manage and support highly distributed multi-tiered systems. The role involves performing root cause analysis, chaos testing, and automating day-to-day activities using various tools and programming languages.
Information Technology & Services

Responsibilities

Ability to triage, complete root cause analysis, and be decisive under pressure
Experience managing and interpreting large datasets using query languages and visualization tools
Strong communication skills, with the ability to reach both technical and non-technical audiences
Proven experience performing chaos testing to build confidence in the system's capability to withstand turbulent conditions in production
Strong understanding of API testing tools (SoapUI, Postman)
Experience managing systems using infrastructure as code tools (IAM, ARM, Terraform, Chef)
Manage a large fleet of on-prem servers (including security and patching oversight)
Handle hundreds of SSL certificates for all applications in scope
Use Ansible and Python to automate day-to-day activities; web development with Django and JavaScript

Qualifications

Chaos testing
Infrastructure as code
API testing tools
Cloud development
Observability tools
Unix/Linux troubleshooting
Communication skills
ITIL processes
DevOps concepts
Application monitoring
Instrumentation skills

Required

Bachelor's degree or higher (or equivalent experience) in a technology-related field (e.g., Engineering, Computer Science) required; Master's degree a plus
5-8+ years of hands-on experience deploying and/or supporting highly distributed multi-tiered systems at scale
Hands-on experience with Public Cloud environments, preferably AWS and Azure. Certifications a plus
Exposure to basic OS-level scripting languages such as Korn/Bash/Jscript
Experience with container orchestration, preferably with Kubernetes
Experience operating and implementing distributed & highly concurrent service-based
Ability to solve application issues on Unix/Linux with J2EE, WebSphere, Tomcat and SQL
Familiarity with ITIL processes such as incident management and change/problem management
Ability to balance delivery with ad hoc workloads and re-evaluate priorities
Solid understanding of Cloud Computing and DevOps concepts including CI/CD pipelines
Hands-on experience with one or more observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, etc.)
Use Datadog, Catchpoint, Splunk & Grafana for Application Observability and monitoring of app & infrastructure
Experience with instrumentation, plus systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale
Proven experience in maintaining scalability and resiliency of complex environments
Proven experience in implementing advanced observability practices and techniques at scale
Provide enterprise cloud and platform engineering support for production environments, including participating in an on-call rotation to provide solutions
Experience in cloud development (AWS and Azure) and migration; experience building and operating highly resilient platforms in public cloud environments

Benefits

Medical
Prescription
Dental
Vision
AD&D
Life insurance offerings
Short-term disability
401K plan

Company

Randstad Digital Americas

Randstad Digital is a trusted digital enablement partner that facilitates accelerated transformation for businesses by providing global talent, capacity, and solutions across specialized domains.

Funding

Current Stage
Late Stage

Leadership Team

Graig Paglieri
CEO, Randstad Digital Americas
Pascal de Hesselle
SVP, Executive Client Partner - Technology, Media & Telecom
Company data provided by Crunchbase