Research Engineer - Evaluations jobs in United States

Luma AI · 9 hours ago

Research Engineer - Evaluations

Luma AI is focused on building multimodal AI to enhance human capabilities and imagination. They are looking for a Research Engineer to develop and scale the infrastructure for model evaluations, ensuring rigorous and consistent measurement of model performance to inform development efforts.

Artificial Intelligence (AI), Generative AI, Video, Video Editing
H1B Sponsor Likely

Responsibilities

Design and implement scalable pipelines for automated evaluation of generative models, with a focus on visual and multimodal outputs (image, video, text, audio)
Develop novel metrics and evaluation models that capture qualities like fidelity, coherence, temporal consistency, and alignment with human intent
Integrate evaluation signals into training loops (including reinforcement learning and reward modeling) to continuously improve model performance
Build infrastructure for large-scale regression testing, benchmarking, and monitoring of multimodal generative models
Collaborate with researchers running human studies to translate human evaluation frameworks into automated or semi-automated systems
Partner with model researchers to identify failure cases and build targeted evaluation harnesses
Maintain dashboards, reporting tools, and alerting systems to surface evaluation results to stakeholders
Stay current with emerging evaluation techniques in generative AI, multimodal LLMs, and perceptual quality assessment

Qualifications

Machine Learning, ML evaluation systems, Python, Generative models, Visual data handling, ML frameworks, Software engineering, Human-in-the-loop workflows, Reinforcement learning, Perceptual metrics, Creative media workflows

Required

Master's or PhD in Computer Science, Machine Learning, or a related technical field (or equivalent industry experience)
3+ years of experience building ML evaluation systems, model pipelines, or large-scale infrastructure
Hands-on experience working with visual data (images and/or video), including evaluation, modeling, or data preparation
Proficiency in Python and ML frameworks (PyTorch, JAX, or TensorFlow)
Familiarity with human-in-the-loop evaluation workflows and how to scale them with automation
Strong background in machine learning, with experience in generative models (diffusion, LLMs, multimodal architectures)
Strong software engineering skills (CI/CD, testing, data pipelines, distributed systems)

Preferred

Experience with reinforcement learning or reward modeling
Prior work on perceptual metrics, multimodal evaluation benchmarks, or retrieval-based evaluation
Background in large-scale model training or evaluation infrastructure
Experience designing metrics for perceptual quality
Familiarity with creative media workflows (film, VFX, animation, digital art)
Contributions to open-source evaluation libraries or benchmarks

Company

Luma AI

Luma AI develops tools that let users generate photorealistic images and videos from text, image, or video prompts.

H1B Sponsorship

Luma AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference (data from the US Department of Labor).
Total Sponsorships by Year
2025: 10
2024: 3

Funding

Current Stage: Growth Stage
Total Funding: $1.06B
Key Investors: HUMAIN, Andreessen Horowitz, Amplify Partners
2025-11-19: Series C, $900M
2024-12-06: Series B, $90M
2024-01-09: Series B, $43M

Leadership Team

Amit Jain, Co-Founder

Company data provided by Crunchbase