
JPMorganChase · 1 day ago

Distributed Training & Performance Engineer - Executive Director

JPMorganChase is a leading financial institution seeking a senior-level engineer to join their Global Technology Applied Research (GTAR) center. The role involves designing, optimizing, and scaling large-model pretraining workloads across hyperscale accelerator clusters, with a focus on improving training throughput and efficiency.
Asset Management · Banking · Financial Services
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Design and optimize distributed training strategies for large-scale models, including data, tensor, pipeline, and context parallelism
Manage end-to-end training performance: from data input pipelines through model execution, communication, and checkpointing
Identify and eliminate performance bottlenecks using systematic profiling and performance modeling
Develop or optimize high-performance kernels using CUDA, Triton, or equivalent frameworks
Design and optimize distributed communication strategies to maximize overlap between computation and inter-node data movement
Design memory-efficient training configurations (caching, optimizer sharding, checkpoint strategies)
Evaluate and optimize training on multiple accelerator platforms, including GPUs and non-GPU accelerators
Contribute performance improvements back to internal training pipelines
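To make the compute-communication overlap responsibility concrete, here is a rough back-of-envelope sketch (all names hypothetical, not from the posting) of how overlapping gradient communication with the backward pass affects step time:

```python
def step_time_ms(compute_ms: float, comm_ms: float, overlap: float) -> float:
    """Estimate a training step's wall time.

    compute_ms: forward + backward compute time
    comm_ms:    gradient all-reduce time if fully exposed
    overlap:    fraction of communication hidden behind computation (0..1)
    """
    # Communication can only hide behind compute that actually exists.
    hidden = min(comm_ms * overlap, compute_ms)
    exposed = comm_ms - hidden
    return compute_ms + exposed

# No overlap: communication adds fully to the step time.
assert step_time_ms(100.0, 40.0, 0.0) == 140.0
# Full overlap: communication is entirely hidden.
assert step_time_ms(100.0, 40.0, 1.0) == 100.0
```

This is only a first-order model; in practice overlap is limited by bucketing granularity, dependency structure, and contention for SMs and memory bandwidth.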

Qualifications

Distributed training strategies · Large-scale model training · GPU programming · Performance optimization · Python for ML systems · C++ for performance-critical components · Collective communication libraries · Modern ML frameworks · Collaboration with hardware vendors · Open-source contributions · Scaling laws understanding

Required

Master's degree with 5+ years of industry experience, or Ph.D. with 3+ years of industry experience, in computer science, physics, math, engineering, or related fields
Engineering experience at top AI labs, HPC centers, chip vendors, or hyperscale ML infra teams
Strong experience designing and operating large-scale distributed training jobs across multinode accelerator clusters
Deep understanding of distributed parallelism strategies: data parallelism, tensor/model parallelism, pipeline parallelism, and memory/optimizer sharding
Proven ability to profile and optimize training performance using industry-standard tools such as Nsight, the PyTorch profiler, or equivalent
Hands-on experience with GPU programming and kernel optimization
Strong understanding of accelerator memory hierarchies, bandwidth limitations, and compute-communication tradeoffs
Experience with collective communication libraries and patterns (e.g., NCCL-style collectives)
Proficiency in Python for ML systems development and C++ for performance-critical components
Experience with modern ML frameworks such as PyTorch or JAX in large-scale training settings
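As a sketch of the collective-communication reasoning the requirements above point at (function name hypothetical): a ring all-reduce over n GPUs runs a reduce-scatter phase followed by an all-gather phase, each moving (n-1)/n of the payload per GPU, which is why its per-GPU traffic approaches 2x the gradient size as the ring grows.

```python
def ring_allreduce_bytes_per_gpu(n_gpus: int, payload_bytes: int) -> float:
    """Bytes each GPU sends in a bandwidth-optimal ring all-reduce:
    reduce-scatter + all-gather, each phase moving (n-1)/n of the payload."""
    return 2 * (n_gpus - 1) / n_gpus * payload_bytes

# For 8 GPUs, each GPU sends 2 * 7/8 = 1.75x the payload size.
assert ring_allreduce_bytes_per_gpu(8, 800) == 1400.0
```

Dividing this traffic by per-link bandwidth gives the lower bound on all-reduce time that a profiler-measured number can be compared against.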

Preferred

Experience optimizing training workloads on non-GPU accelerators (e.g., TPUs or wafer-scale architectures)
Familiarity with compiler-driven ML systems (e.g., XLA, MLIR, Inductor) and graph-level optimizations
Experience designing custom fused kernels or novel execution strategies for attention or large matrix operations
Strong understanding of scaling laws governing large-model pretraining dynamics and stability considerations
Contributions to open-source ML systems, distributed training frameworks, or performance-critical kernels
Prior experience collaborating directly with hardware vendors or accelerator teams

Benefits

Comprehensive health care coverage
On-site health and wellness centers
A retirement savings plan
Backup childcare
Tuition reimbursement
Mental health support
Financial coaching

Company

JPMorganChase

With a history tracing its roots to 1799 in New York City, JPMorganChase is one of the world's oldest, largest, and best-known financial institutions—carrying forth the innovative spirit of our heritage firms in global operations across 100 markets.

H1B Sponsorship

JPMorganChase has a track record of sponsoring H1B visas. Note that this does not guarantee sponsorship for this specific role; the figures below are provided for reference. (Data powered by the US Department of Labor)
Distribution of Job Fields Receiving Sponsorship (chart not reproduced)
Trends of Total Sponsorships
2020: 2,495
2021: 2,515
2022: 3,594
2023: 3,395
2024: 3,469
2025: 3,471

Funding

Current Stage
Public Company
Total Funding
unknown
IPO: 1998-02-01

Leadership Team

Allison Beer, CEO of Card Services and Connected Commerce
Dan Mendelson, CEO, Morgan Health
Company data provided by Crunchbase.