Lead Site Reliability Engineer, AI/ML Platform jobs in United States

Chase · 8 hours ago

Lead Site Reliability Engineer, AI/ML Platform

Chase, one of the oldest financial institutions, is seeking a Lead Site Reliability Engineer for its AI/ML Platform. The role involves designing and implementing solutions to enhance the reliability and scalability of AI/ML platforms, partnering with product engineering teams, and providing strategic technology leadership.
Banking, Financial Services

Responsibilities

Design and implement solutions to enhance the reliability and scalability of AI/ML platforms and applications to accommodate fast-growing demand
Partner with product engineering teams to ensure the AI/ML systems are reliable and high performing
Develop observability, security, automation, and FinOps tools and orchestration
Provide strategic technology leadership by defining and evaluating standards and architecture for reliability, observability and automation frameworks
Build strong cross-functional relationships that foster engagements across the organization and deliver solutions to user problems
Debug and resolve issues in production environments, identifying root causes and remediating them
Participate in on-call rotations, incident management and escalation workflows
Take full ownership of problems, develop solutions, and acquire new knowledge to complete the task
Mentor and guide junior engineers

Qualifications

SRE principles, Cloud platforms, Advanced observability tools, Distributed systems architecture, Problem-solving skills, Self-managed, Urgency, AI/ML experience, Multi-cloud experience, Communication, Self-motivated, Ownership

Required

Bachelor's degree in Computer Science, Information Technology, or equivalent technical qualification with 5+ years of professional experience
Expertise in SRE principles and in the reliability, scalability and performance of applications and infrastructure
Hands-on experience with cloud platforms (AWS, GCP, Azure) and IaC tools (Terraform, Ansible)
Extensive experience implementing advanced observability using tools like OpenTelemetry, Dynatrace, Grafana, and/or cloud-native services
Experience in architecting distributed systems and cloud-native architecture in AWS
Systematic problem-solving and troubleshooting skills in complex systems
Excellent communication skills and ability to represent and present business and technical concepts to stakeholders
Self-managed and self-motivated, with a strong sense of ownership, urgency, and drive

Preferred

Prior experience working in AI, ML, or data engineering
Prior experience developing AIOps tooling or AI agents
Multi-cloud experience (AWS, GCP, Azure)

Benefits

Comprehensive health care coverage
On-site health and wellness centers
A retirement savings plan
Backup childcare
Tuition reimbursement
Mental health support
Financial coaching

Company

Chase provides a broad range of financial services. It is a sub-organization of JPMorgan Chase.

Funding

Current Stage: Late Stage

Leadership Team

Mike McDonnell
Managing Director, Head of Chase Travel Platform Product
Nicole Sanchez
Managing Director, Consumer Bank, GM and Product Executive, Growth Financial Products
Company data provided by Crunchbase