MBZUAI (Mohamed bin Zayed University of Artificial Intelligence) · 5 months ago
Distributed Machine Learning Engineer
Mohamed bin Zayed University of Artificial Intelligence is dedicated to research, innovation, and empowering brilliant minds in AI. The Distributed Machine Learning Engineer will optimize performance for machine learning software stacks, develop new systems, and work alongside researchers to tackle challenges in AI development.
Artificial Intelligence (AI) · Higher Education · Universities
Responsibilities
Understand, analyze, profile, and optimize deep learning workloads on state-of-the-art hardware and software platforms, and guide the team on improving their efficiency at different levels of optimization
Design and implement performance benchmarks and testing methodologies to evaluate application performance
Build tools to automate workload analysis, workload optimization, and other critical workflows
Triage system issues and identify bottlenecks and inefficiencies by analyzing the sources of issues and their impact on hardware and network, and propose solutions to enhance GPU utilization
Support the team to develop appropriate kernels and systems for new model architectures and algorithms
Participate in, or lead, design reviews with peers and stakeholders to decide among available technologies
Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency)
Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback
Represent MBZUAI at industry conferences and events, showcasing the institution’s cutting-edge HPC and deep learning capabilities and establishing MBZUAI as a global leader in AI research and innovation
Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives
Qualifications
Required
Ph.D. in CS, EE or CSEE with 1+ years of working experience, OR
Master's in CS, EE or CSEE, or equivalent experience, with 2+ years of working experience
Strong background in parallel computing
Hands-on experience in system level coding
Experience with debugging methodologies
Large-scale machine learning experience
Benefits
Comprehensive medical, dental, and vision benefits
Bonus
401K Plan
Generous paid time off, sick leave and holidays
Paid Parental Leave
Employee Assistance Program
Life insurance and disability
Company
MBZUAI (Mohamed bin Zayed University of Artificial Intelligence)
Official account of Mohamed bin Zayed University of Artificial Intelligence. Dedicated to research, innovation, and empowering brilliant minds in AI.
Funding
Current Stage
Growth Stage
Total Funding
$0.04M
Key Investors
Llama
2024-09-24 · Grant · $0.04M
Company data provided by Crunchbase