Choice Hotels International · 14 hours ago

Staff Software Engineer - Resiliency and Platform Engineering

Scottsdale, AZ

Full-time

Hybrid

Lead/Staff

8+ years exp

Choice Hotels International is seeking a Staff Software Engineer for their SkyTouch Technology division, which provides a cloud-based hotel property management system. The role focuses on enhancing the resiliency and operability of a large-scale SaaS platform by improving foundational capabilities and developer experience.

Hospitality · Hotel · Travel

No H1B

Responsibilities

Design and implement platform-level capabilities including shared libraries, frameworks, tooling, automation, and guardrails that improve application resiliency, runtime safety, and developer experience across the ecosystem, favoring leverage and durability over short-term delivery

Strengthen foundational platform and runtime behavior by identifying and eliminating systemic failure modes such as JVM memory leaks, unsafe defaults, brittle error handling, poor failure propagation, and resource exhaustion

Improve how software is built and operated at scale by defining and rolling out developer-facing standards and paved roads for resiliency, observability, error handling, and operational readiness

Define, standardize, and evolve logging, monitoring, alerting, and observability practices that improve signal quality, reduce noise, and enable faster diagnosis and recovery

Partner closely with Principal Software Engineers, Solution Architects, and Engineering Managers to identify systemic risks and translate them into well-scoped platform and resiliency initiatives and technical work

Operate across software engineering resiliency, data engineering resiliency, and platform engineering teams to identify cross-cutting risks, design shared solutions, and raise the technical bar, rather than owning individual team backlogs

Engage directly in application codebases, particularly during ramp-up, to understand real-world system behavior, identify failure patterns, and validate resiliency improvements. Exit application-level work once learning is complete and systemic improvements are identified

Participate in incident postmortems and operational reviews to identify recurring patterns and ensure lessons learned are translated into durable platform or resiliency improvements, not one-off fixes

Evaluate, prototype, and introduce tools and technologies that measurably improve developer productivity, platform safety, and application resiliency, prioritizing adoption, simplicity, and long-term impact

Apply AI-assisted development, diagnostics, and operational tools where they demonstrably improve engineering productivity, root cause analysis, signal quality, or resiliency outcomes

Influence engineering practices and technical direction through design reviews, reference implementations, mentorship, and technical leadership rather than formal authority or delivery ownership

Qualifications

Java-based services · Cloud-native workloads · AWS public cloud · Application monitoring · AI-assisted tools · Site Reliability Engineering · Soft skills

Required

Bachelor's degree in computer science or a related technical field, or equivalent practical experience building and operating production systems

Typically 8–10+ years of hands-on experience designing, building, and supporting large-scale software systems in production environments

Hands-on experience designing, building, and operating Java-based services, including Spring Boot applications running in virtualized and containerized environments

Experience developing and supporting cloud-native and serverless workloads, including Python-based services and event-driven architectures

Strong practical experience working in AWS public cloud environments, with an understanding of how cloud-managed services influence reliability, scalability, and operational behavior

Working knowledge of relational and non-relational data stores, including how data persistence, availability, and failure characteristics impact system design and resiliency

Experience using application monitoring and observability platforms to understand system behavior in production, such as application performance monitoring, centralized logging, and cloud-native telemetry tools (for example, AppDynamics, OpenSearch, Amazon CloudWatch, or similar)

Comfortable diagnosing complex production issues by interpreting metrics, logs, traces, and runtime signals rather than relying solely on reactive incident handling

Solid understanding of Site Reliability Engineering (SRE) principles, with the judgment to apply them selectively to guide platform and resiliency improvements rather than adopting SRE practices as a one-size-fits-all operating model

Demonstrated ability to choose between software design changes, platform capabilities, or developer enablement as the most effective way to improve reliability and operability

Hands-on experience designing and delivering one or more platform-level capabilities such as shared libraries, frameworks, internal tooling, or enablement platforms used by multiple application teams

Experience creating and rolling out paved roads, guardrails, or standardized patterns that balance safety, usability, and developer autonomy

Experience using AI-assisted tools (such as code assistants, log/trace analysis, or incident analysis tools) to improve engineering effectiveness or system reliability

Proven ability to influence technical direction and engineering practices across teams without direct ownership of delivery backlogs

Successful candidates for this role consistently demonstrate strength in the following Korn Ferry competencies:

Manages Complexity: Navigates complex technical environments, synthesizes information across systems, and identifies systemic root causes

Decision Quality: Makes sound technical decisions under constraints, balancing immediate needs with long-term platform health

Drives Results: Delivers durable improvements in platform resiliency, stability, and developer effectiveness

Preferred

Cloud or technology certifications (such as AWS certifications or equivalent) are a plus and demonstrate commitment to building and operating reliable systems at scale

Benefits

Competitive compensation and benefits, including medical, dental, and vision coverage

Leave and paid time-off for holidays, vacation, personal, family, volunteer, sick, jury duty, bereavement, military, and religious observance

Financial benefits for retirement and health savings

Employee recognition programs

Discounts at Choice hotels worldwide

Company

Choice Hotels International

Glassdoor: 3.9

Choice Hotels International is a hospitality franchisor that provides businesses and travelers with a range of lodging options.

Founded in 1941

Rockville, Maryland, USA

1001-5000 employees

http://www.choicehotels.com

Funding

Current Stage

Public Company

Total Funding

$600M

2024-06-25 · Post-IPO Debt · $600M

1996-10-16 · IPO

Leadership Team

Judd Wadholm

Senior Vice President and General Manager, Core Brands

Noha Abdalla

Chief Marketing Officer

Recent News

Marketing Dive

Choice Hotels hones in on value in latest global marketing campaign

2026-01-22

PR Newswire

Choice Hotels International to Report Fourth Quarter and Full-Year 2025 Earnings on February 19, 2026

2026-01-16

Hotel Management Network

Choice Hotels to open six new Ascend Collection properties in Canada

2026-01-07

Company data provided by Crunchbase