Meta · 3 days ago
Research Scientist, AI & Systems Co-design (PhD)
Meta is a technology company that builds platforms to connect people and empower communities. They are seeking a Research Scientist to explore and optimize AI and systems co-design, focusing on improving performance and efficiency in large-scale training and inference systems. The role involves collaborating with various teams to innovate and prototype solutions that can be productionized for Meta's AI workloads.
Computer Software
Responsibilities
Explore, co-design and optimize parallelisms, compute efficiency, distributed training/inference paradigms and algorithms to improve the scalability, efficiency and reliability of inference and large-scale training systems
Innovate and co-design novel model architectures for sustained scaling and hardware efficiency during training and inference
Benchmark, analyze, model and project the performance of AI workloads against a wide range of what-if scenarios and provide early input to the design of future hardware, models and runtime, giving crucial feedback to the architecture, compiler, kernel, modeling and runtime teams
Explore, co-design and productionize model compression techniques such as Quantization, Pruning, Distillation and Sparsity to improve training and inference efficiency
Explore, prototype and productionize highly optimized ML kernels to unlock the full potential of current and future accelerators for Meta’s AI workloads. Open-source SOTA implementations where applicable
Optimize inference and training communications performance at scale and investigate improvements to algorithms, tooling, and interfaces, working across multiple accelerator types and HPC collective communication libraries such as NCCL, RCCL, UCC and MPI
Guide Meta’s AI HW requirements and design, focusing on performance at the System and Silicon levels. Co-design and optimize our AI HW and related software stack for Meta’s future workloads, with technology pathfinding and evaluation of cutting-edge hardware systems, including off-market ones, spanning multi-vendor/generation GPUs and ASICs, including Meta’s in-house MTIA
Qualifications
Required
Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Computer Vision, Generative AI, NLP, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
Specialized experience in one or more of the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model Compression, Communication Collectives, ML Kernels/Operator optimizations, Machine learning frameworks (e.g. PyTorch) and SW/HW co-design
Experience developing AI-System infrastructure or AI algorithms in C/C++ or Python
Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Preferred
Experience or knowledge of training/inference of large-scale deep learning models
Experience or knowledge of either Generative AI models such as LLMs/LDMs or Ranking & Recommendation models such as DLRM or equivalent
Experience or knowledge of distributed ML systems and algorithm development
Experience or knowledge of at least one of the responsibilities listed in this job posting
Benefits
Bonus
Equity
Benefits
Company
Meta
Meta's mission is to build the future of human connection and the technology that makes it possible.
Funding
Current Stage
Late Stage
Recent News
Crunchbase News
2025-11-17
2025-11-16
Company data provided by Crunchbase