AIML - Staff ML Infrastructure Engineer, ML Platform & Technology - Pre-training Compute jobs in United States

Apple · 3 months ago

AIML - Staff ML Infrastructure Engineer, ML Platform & Technology - Pre-training Compute

Apple is where individual imaginations gather together, committing to the values that lead to great work. As an engineer on the ML Compute team, your work will involve driving large-scale pre-training initiatives and enhancing distributed training techniques to support cutting-edge foundation models.

Apps, Artificial Intelligence (AI), Broadcasting, Digital Entertainment, Foundational AI, Media and Entertainment, Mobile Devices, Operating Systems, TV, Wearables
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Drive large-scale pre-training initiatives to support cutting-edge foundation models, focusing on resiliency, efficiency, scalability, and resource optimization
Enhance distributed training techniques for foundation models
Research and implement new patterns and technologies to improve system performance, maintainability, and design
Optimize execution and performance of workloads built with JAX, PyTorch, XLA and CUDA on large distributed systems
Leverage high-performance networking technologies such as NCCL for GPU collectives and TPU interconnect (ICI/Fabric) for large-scale distributed training
Architect a robust MLOps platform to streamline and automate pretraining operations
Operationalize large-scale ML workloads on Kubernetes, ensuring distributed training jobs are robust, efficient, and fault-tolerant
Lead complex technical projects, defining requirements and tracking progress with team members
Collaborate with cross-functional engineers to solve large-scale ML training challenges
Mentor engineers in areas of your expertise, fostering skill growth and knowledge sharing
Cultivate a team centered on collaboration, technical excellence, and innovation

Qualifications

Distributed systems, Machine learning models, Kubernetes, Python, Cloud computing, JAX, PyTorch, GPU debugging, Collaboration, Mentoring

Required

Bachelor's degree in Computer Science, engineering, or a related field
6+ years of hands-on experience in building scalable backend systems for training and evaluation of machine learning models
Proficient in relevant programming languages, like Python or Go
Strong expertise in distributed systems, reliability and scalability, containerization, and cloud platforms
Proficient in cloud computing infrastructure and tools: Kubernetes, Ray, PySpark
Ability to clearly and concisely communicate technical and architectural problems, while working with partners to iteratively find

Preferred

Advanced degree in Computer Science, engineering, or a related field
Proficient in working with and debugging accelerators such as GPUs, TPUs, and AWS Trainium
Proficient in ML training and deployment frameworks such as JAX, TensorFlow, PyTorch, TensorRT, and vLLM

Benefits

Comprehensive medical and dental coverage
Retirement benefits
A range of discounted products and free services
Reimbursement for certain educational expenses — including tuition
Discretionary bonuses or commission payments
Relocation

Company

Apple is a technology company that designs, manufactures, and markets consumer electronics, personal computers, and software.

H1B Sponsorship

Apple has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (6998)
2024 (3766)
2023 (3939)
2022 (4822)
2021 (4060)
2020 (3656)

Funding

Current Stage
Public Company
Total Funding
$5.67B
Key Investors
Berkshire Hathaway, Microsoft, Sequoia Capital
2025-05-05 · Post-IPO Debt · $4.5B
2025-01-16 · Post-IPO Debt · $0.31M
2021-04-30 · Post-IPO Equity

Leadership Team

Tim Cook
CEO
Craig Federighi
SVP, Software Engineering
Company data provided by Crunchbase