Lead Site Reliability Engineer, AI/ML Platform jobs in United States

Chase · 3 weeks ago

Lead Site Reliability Engineer, AI/ML Platform

Chase is seeking a Lead Site Reliability Engineer for its AI/ML Platform. The role involves designing and implementing solutions to enhance the reliability and scalability of AI/ML platforms, as well as providing strategic technology leadership and mentoring junior engineers.

Banking · Financial Services

Responsibilities

Design and implement solutions to enhance the reliability and scalability of AI/ML platforms and applications to accommodate fast-growing demand
Partner with product engineering teams to ensure the AI/ML systems are reliable and high performing
Develop observability, security, automation, and FinOps tools and orchestration
Provide strategic technology leadership by defining and evaluating standards and architecture for reliability, observability and automation frameworks
Build strong cross-functional relationships that foster engagements across the organization and deliver solutions to user problems
Debug and resolve issues in production environments, identify root causes, and remediate them
Participate in on-call rotations, incident management, and escalation workflows
Take full ownership of problems, develop solutions, and acquire new knowledge to complete the task
Mentor and guide junior engineers

Qualifications

SRE principles · Cloud platforms · Observability tools · Distributed systems architecture · IaC tools · Problem-solving skills · Communication skills · Self-motivated

Required

Bachelor's degree in Computer Science, Information Technology, or an equivalent technical qualification, with 5+ years of professional experience
Expertise in SRE principles and in the reliability, scalability, and performance of applications and infrastructure
Hands-on experience with cloud platforms (AWS, GCP, Azure) and IaC tools (Terraform, Ansible)
Extensive experience implementing advanced observability using tools like OpenTelemetry, Dynatrace, Grafana, and/or cloud-native services
Experience in architecting distributed systems and cloud-native architecture in AWS
Systematic problem-solving and troubleshooting skills in complex systems
Excellent communication skills and ability to represent and present business and technical concepts to stakeholders
Self-managed and self-motivated, with a strong sense of ownership, urgency, and drive

Preferred

Prior experience working in AI, ML, or data engineering
Prior experience developing AIOps tooling or AI agents
Multi-cloud experience (AWS, GCP, Azure) is a plus

Company

Chase provides a broad range of financial services. It is a sub-organization of JPMorgan Chase.

Funding

Current Stage
Late Stage

Leadership Team

Mike McDonnell
Managing Director, Head of Chase Travel Platform Product
Nicole Sanchez
Managing Director, Consumer Bank, GM and Product Executive, Growth Financial Products
Company data provided by Crunchbase