Luma AI · 4 hours ago

Research Scientist / Engineer – Multimodal Capabilities

United States

Full-time

Remote

Senior Level

Luma AI is dedicated to building multimodal AI to enhance human capabilities. The role involves conducting pioneering research to define the future capabilities of multimodal models, designing experiments, and collaborating with research teams to translate findings into product experiences.

Artificial Intelligence (AI), Generative AI, Video, Video Editing

H1B Sponsor Likely

Responsibilities

Research and define the next frontier of multimodal capabilities, identifying key gaps in our current models and designing the experiments to solve them

Design and execute novel experiments, datasets, and methodologies to systematically improve model performance across vision, audio, and language

Develop and pioneer new evaluation frameworks and benchmarking approaches to precisely measure novel multimodal behaviors and capabilities

Collaborate deeply with other research teams to translate your findings into our core training recipes and unlock new product experiences

Build and prototype compelling demonstrations that showcase the groundbreaking multimodal capabilities you have unlocked

Qualifications

Python, PyTorch, Multimodal data pipelines, Computer vision, Natural language processing, Audio processing, Research experience, Publication record, Collaboration, Problem-solving

Required

You have a PhD or equivalent research experience in a field related to AI, Machine Learning, or Computer Science

You have strong programming skills in Python and deep, hands-on experience with PyTorch

You have a proven track record of working with multimodal data pipelines and curating large-scale datasets for research

You possess a deep, fundamental understanding of at least one of the core modalities: computer vision, audio processing, or natural language processing

You thrive on tackling the most ambitious, open-ended research challenges in a fast-paced, collaborative environment

Preferred

Direct expertise working with complex, interleaved multimodal data (video, audio, text)

Hands-on experience training or fine-tuning Vision Language Models (VLMs), Audio Language Models, or large-scale generative video models from scratch

A strong publication record in top-tier AI conferences (e.g., NeurIPS, ICML, CVPR, ICLR)

Experience leading ambitious, open-ended research projects from ideation to tangible results

Company

Luma AI

Luma AI develops tools that let users generate photorealistic images and videos from text, image, or video prompts.

Founded in 2021

Palo Alto, California, USA

11-50 employees

https://lumalabs.ai

H1B Sponsorship

Luma AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship

Trends of Total Sponsorships

2025 (10)

2024 (3)

Funding

Current Stage

Growth Stage

Total Funding

$1.06B

Key Investors

HUMAIN, Andreessen Horowitz, Amplify Partners

2025-11-19 · Series C · $900M

2024-12-06 · Series B · $90M

2024-01-09 · Series B · $43M

Leadership Team

Amit Jain

Co-Founder

Recent News

Business Wire

Luma AI Announces Ray3 Modify, a New Model for Hybrid-AI Workflows for Acting & Performances, Now Available in Dream Machine

2025-12-18

Crunchbase News

Jeff Bezos’s Project Prometheus Joins The Unicorn Board Alongside 18 Other Startups In November

2025-12-10

Pulse 2.0

Luma AI: International Expansion With New London Office

2025-12-07

Company data provided by Crunchbase