JPMorganChase
Distributed Training & Performance Engineer - Vice President
JPMorganChase is a leading financial institution offering innovative financial solutions. The firm is seeking a senior-level engineer to design, optimize, and scale large-model pretraining workloads across hyperscale accelerator clusters, with a focus on distributed systems and performance engineering.
Asset Management · Banking · Financial Services
Responsibilities
Design and optimize distributed training strategies for large-scale models, including data, tensor, pipeline, and context parallelism
Manage end-to-end training performance: from data input pipelines through model execution, communication, and checkpointing
Identify and eliminate performance bottlenecks using systematic profiling and performance modeling
Develop or optimize high-performance kernels using CUDA, Triton, or equivalent frameworks
Design and optimize distributed communication strategies to maximize overlap between computation and inter-node data movement (a minimal sketch follows this list)
Design memory-efficient training configurations (caching, optimizer sharding, checkpoint strategies)
Evaluate and optimize training on multiple accelerator platforms, including GPUs and non-GPU accelerators
Contribute performance improvements back to internal training pipelines
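As a concrete illustration of the overlap work described above, here is a minimal sketch, assuming a standard torchrun launch, an NCCL backend, and a toy model (none of which come from this posting), of data-parallel gradient synchronization that overlaps all-reduce communication with the backward pass in PyTorch:

    import os
    import torch
    import torch.distributed as dist

    def attach_overlap_hooks(model):
        """Launch an async all-reduce as soon as each parameter's gradient
        is ready, so inter-node communication overlaps with the remaining
        backward compute instead of waiting for backward to finish."""
        handles = []

        def hook(param):
            # Average this parameter's gradient across ranks without
            # blocking; the returned handle is drained before the step.
            handles.append(
                dist.all_reduce(param.grad, op=dist.ReduceOp.AVG, async_op=True)
            )

        for p in model.parameters():
            if p.requires_grad:
                p.register_post_accumulate_grad_hook(hook)
        return handles

    if __name__ == "__main__":
        # Assumes a `torchrun` launch, which sets LOCAL_RANK and peers.
        dist.init_process_group(backend="nccl")
        torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
        model = torch.nn.Linear(4096, 4096).cuda()
        handles = attach_overlap_hooks(model)
        loss = model(torch.randn(8, 4096, device="cuda")).square().mean()
        loss.backward()
        for h in handles:
            h.wait()  # all gradient all-reduces complete before optimizer step
        dist.destroy_process_group()

Production stacks typically get this for free from DDP's bucketed all-reduce or FSDP's sharded collectives; the per-parameter hooks above just make the overlap mechanism visible.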
Qualifications
Required
Master's degree with 3+ years of industry experience, or Ph.D. with 1+ years of industry experience, in computer science, physics, math, engineering, or related fields
Engineering experience at top AI labs, HPC centers, chip vendors, or hyperscale ML infra teams
Strong experience designing and operating large-scale distributed training jobs across multi-node accelerator clusters
Deep understanding of distributed parallelism strategies: data parallelism, tensor/model parallelism, pipeline parallelism, and memory/optimizer sharding
Proven ability to profile and optimize training performance using industry-standard tools such as Nsight, PyTorch profiler, or equivalent (a minimal example follows this list)
Hands-on experience with GPU programming and kernel optimization
Strong understanding of accelerator memory hierarchies, bandwidth limitations, and compute-communication tradeoffs
Experience with collective communication libraries and patterns (e.g., NCCL-style collectives)
Proficiency in Python for ML systems development and C++ for performance-critical components
Experience with modern ML frameworks such as PyTorch or JAX in large-scale training settings
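For the profiling requirement, a minimal torch.profiler setup for ranking hot GPU kernels in a training loop might look like the sketch below; train_step is a hypothetical placeholder for one full forward/backward/optimizer step, not something defined in this posting:

    import torch
    from torch.profiler import ProfilerActivity, profile, schedule

    def profile_train_loop(train_step, steps=8):
        """Trace CPU and CUDA activity for a few steps and print the
        operators that spend the most time on the GPU itself."""
        with profile(
            activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
            schedule=schedule(wait=1, warmup=2, active=3),  # skip cold-start steps
            on_trace_ready=torch.profiler.tensorboard_trace_handler("./traces"),
            record_shapes=True,
        ) as prof:
            for _ in range(steps):
                train_step()
                prof.step()  # advance the wait/warmup/active schedule
        print(prof.key_averages().table(sort_by="self_cuda_time_total",
                                        row_limit=15))

The resulting trace opens in TensorBoard for timeline inspection; Nsight Systems complements this with an OS- and driver-level view.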
Preferred
Experience optimizing training workloads on non-GPU accelerators (e.g., TPUs or wafer-scale architectures)
Familiarity with compiler-driven ML systems (e.g., XLA, MLIR, Inductor) and graph-level optimizations
Experience designing custom fused kernels or novel execution strategies for attention or large matrix operations
Strong understanding of scaling laws governing large-model pretraining dynamics and stability considerations (a representative formula follows this list)
Contributions to open-source ML systems, distributed training frameworks, or performance-critical kernels
Prior experience collaborating directly with hardware vendors or accelerator teams
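As background for the scaling-laws item, one widely cited form is the Chinchilla-style loss model of Hoffmann et al. (2022); treating it as representative is an assumption of this note, not a statement about any particular internal model:

    L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
    \qquad \alpha \approx 0.34, \quad \beta \approx 0.28

Here N is parameter count, D is training tokens, and E is the irreducible loss. With the usual transformer compute approximation C \approx 6ND, minimizing L at fixed C gives roughly N \propto C^{1/2} and D \propto C^{1/2}, i.e. parameters and tokens should be scaled together as the compute budget grows.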
Benefits
Comprehensive health care coverage
On-site health and wellness centers
A retirement savings plan
Backup childcare
Tuition reimbursement
Mental health support
Financial coaching
Company
JPMorganChase
With a history tracing its roots to 1799 in New York City, JPMorganChase is one of the world's oldest, largest, and best-known financial institutions, carrying forward the innovative spirit of its heritage firms in global operations across 100 markets.
H1B Sponsorship
JPMorganChase has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. The data below is provided for reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart; the highlighted field represents jobs similar to this one)
Trends of Total Sponsorships
2025: 3,471
2024: 3,469
2023: 3,395
2022: 3,594
2021: 2,515
2020: 2,495
Funding
Current Stage: Public Company
Total Funding: Unknown
IPO Date: 1998-02-01
Company data provided by crunchbase