Student Researcher [Seed Vision – Multimodal Video Generation] – 2026 Start (PhD) jobs in United States

ByteDance · 7 hours ago

Student Researcher [Seed Vision – Multimodal Video Generation] – 2026 Start (PhD)

ByteDance is a pioneering company dedicated to advanced AI foundation models. The role involves conducting research on multimodal video generation and collaborating with researchers and engineers to enhance generative models for visual content.

Content, Data Mining, Internet, Social Media
Comp. & Benefits
H1B Sponsor Likely

Responsibilities

Conduct research on multimodal video generation, with a focus on improving semantic alignment between inputs and generated content
Integrate vision-language models (e.g., CLIP, pre/post-trained VLMs) into video generation architectures to enhance input understanding
Explore and implement joint training or fine-tuning approaches that couple VLMs with video generation backbones
Evaluate model performance on tasks requiring high-level reasoning or detailed semantic control over generation
Collaborate with researchers and engineers to iterate on prototypes within an existing infrastructure

Qualifications

Computer Vision, Machine Learning, Vision-language models, Video generation, PyTorch, Research ability, Clean coding, Collaboration

Required

Currently pursuing a PhD in Computer Vision, Machine Learning, or a related field
Research experience in one or more of the following areas: Vision-language models (VLMs); Multimodal or joint model training; Video generation
Solid coding ability and a clean research implementation style; candidates are expected to work with a production-grade codebase (e.g., PyTorch)
Demonstrated research ability, with first-author publications in top-tier ML/CV/AI conferences such as CVPR, ICCV, ECCV, and ICLR

Preferred

Experience in training or fine-tuning autoregressive or diffusion-based video generation models
Background in multimodal instruction-following, alignment, or conditioning for generation tasks
Understanding of evaluation techniques for assessing semantic consistency in generated video

Benefits

Health insurance
Life insurance
Wellbeing benefits
10 paid holidays per year
Paid sick time (56 hours if hired in the first half of the year, 40 hours if hired in the second half)
Housing allowance

Company

ByteDance

ByteDance is a technology company that develops content creation platforms and services.

H1B Sponsorship

ByteDance has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data Powered by US Department of Labor)
Trends of Total Sponsorships
2025 (1161)
2024 (1123)
2023 (775)
2022 (487)
2021 (417)
2020 (245)

Funding

Current Stage: Late Stage
Total Funding: $9.8B
Key Investors: Capital Today, G42, Tiger Global Management
2025-11-20 · Secondary Market · $300M
2024-07-25 · Secondary Market
2023-03-14 · Secondary Market · $100M

Leadership Team

Jochen Bischoff
Head of Global Business Solutions - Africa
Matty Lin
Managing Director, Monetization and Partnerships
Company data provided by Crunchbase