Meta · 1 week ago
Research Scientist Intern, Vision-Language and Embodied AI (PhD)
Meta is seeking a Research Scientist Intern in the Reality Labs Research team to develop next-generation assistance systems in contextual environments. The role involves executing research on embodied AI, vision-language models, and collaborating with engineers to create scalable solutions.
Computer Software
Responsibilities
Plan and execute cutting-edge research on embodied AI algorithms, assistance policies, vision-language models, and world model learning for complex, real-world interaction tasks
Develop, implement, and evaluate methods for improving the performance and interpretability of VLMs and related AI/ML models
Leverage state-of-the-art simulators, RL/DRL, neuro-symbolic, AI planning, robotics, stochastic programming, and multimodal learning methods
Write modular, reusable research code and utilize Meta’s large infrastructure to scale experimentation
Collaborate cross-functionally with researchers and engineers to prototype and test models at scale
Deliver clear, compelling, and creative solutions to challenging problems
Work should result in publishable research in top-tier journals or conferences (e.g., NeurIPS, ICLR, CVPR, ECCV, ICML, ICCV, AAAI, IJCAI, ICRA, IEEE T-PAMI, IJCV, IEEE RA-L)
Qualifications
Required
Currently has, or is in the process of obtaining, a PhD in Machine Learning, Artificial Intelligence, Computer Vision, Robotics, Speech Processing, Applied Statistics, Computational Neuroscience, Algorithms, Computational Mathematics, or a related field
Proven research skills: problem definition, solution exploration, analysis, and presentation of results
2+ years of experience in Python and machine learning libraries (NumPy, scikit-learn, SciPy, pandas, Matplotlib, TensorFlow, PyTorch)
Understanding of at least one of the following: embodied AI, reinforcement learning, planning, transfer/few-shot/zero-shot/continual/online learning, self-supervised learning, multi/cross-modal learning, vision-language models, LLM interpretability, world model learning, hand pose estimation, or object pose estimation
Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
Preferred
Proven track record of significant results: grants, fellowships, patents, and first-authored publications at leading workshops or conferences (e.g., NeurIPS, ICLR, CVPR, ECCV, ICML, ICCV, AAAI, IJCAI, ICRA, IEEE T-PAMI, IJCV, IEEE RA-L)
Experience with VLM/LLM training/fine-tuning and solving traditional CV problems (e.g., hand/body pose estimation, object pose estimation, image classification/segmentation, image/video understanding, 3D scene reconstruction)
Experience working and communicating cross-functionally in a team environment
Intent to return to the degree program after the completion of the internship/co-op
Availability for an internship of at least 16 consecutive weeks
Benefits
In addition to base compensation, Meta offers benefits.
Company
Meta
Meta's mission is to build the future of human connection and the technology that makes it possible.
Funding
Current Stage
Late Stage