disney · 1 month ago
Senior Systems Reliability Engineer
The Walt Disney Company is a world-class entertainment and technology leader. The Senior Systems Reliability Engineer is responsible for ensuring the stability, scalability, and performance of mission-critical systems that support Disney’s innovative entertainment experiences. The role collaborates across engineering and operations to architect resilient solutions and drive continuous improvement.
Events · News · Publishing
Responsibilities
Administer Windows and Linux servers supporting automation and industrial applications (e.g. Ignition, FactoryTalk, Copia, Coverity)
Collaborate closely with engineering and project teams to implement CI pipeline automation to streamline PLC testing
Develop tools or scripts to automate documentation generation
Define, measure, and monitor service-level indicators/objectives (SLIs/SLOs) and manage error budgets for critical services
Manage Kubernetes clusters and Helm chart deployments for automation and monitoring applications
Identify and automate manual operational processes (“toil”) within project teams to improve reliability
Ensure high availability, scalability, and disaster recovery readiness for Operational Technology (OT) systems
Qualifications
Required
Minimum of 5 years in production system reliability (web, cloud, OT, or embedded), including at least 2 years with industrial or embedded control systems
Hands-on experience managing Kubernetes clusters and Helm-based deployments
Ability to install and configure operating systems, with expertise in Linux and Windows Server
Continuous Integration (CI) expertise with GitLab CI or a similar tool
Experience with Source Control Management systems (Git)
Experience with AWS or another cloud platform
Advanced skills in at least one programming language such as Python, PHP, Ruby, Java, Go, Swift, or C++, and the ability to build unit test suites for all software being developed
Excellent verbal and written communication at all levels of the organization
Communication of ideas and solutions in a clear and organized manner
Clear and effective presentations to groups of people
Construction of concise and complete technical documentation
Bachelor's degree in Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience
Preferred
Experience supporting industrial automation platforms (Ignition, FactoryTalk, Copia, etc.)
Experience with multiple public cloud platforms (AWS, Azure, GCP)
Full stack web development experience
Demonstrates curiosity and continuous learning and self-improvement
Ability to influence architectural decisions and advocate for best reliability practices
Skills in Datadog monitoring and alerting, and in instrumentation with OpenTelemetry
Contributions to reliability-related open-source projects or technical communities
Benefits
A bonus and/or long-term incentive units may be provided as part of the compensation package
The full range of medical, financial, and/or other benefits
Company
disney
disney.com
H1B Sponsorship
Disney has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below
for reference. (Data provided by the US Department of Labor)
Trends of Total Sponsorships
2025 (83)
2024 (63)
2023 (96)
2022 (130)
2021 (30)
2020 (40)
Funding
Current Stage: Early Stage
Total Funding: unknown
2012-01-20: Acquired
Company data provided by crunchbase