ByteDance · 6 hours ago
Research Scientist Graduate (Foundation Model, Vision and Language) - 2026 Start (PhD)
ByteDance is a pioneering company in advanced AI foundation models, and they are looking for talented graduates to join their Doubao-Vision team. The role involves conducting cutting-edge research in computer vision and natural language processing, focusing on multi-modality and enhancing model performance through innovative methodologies.
Content · Data Mining · Foundational AI · Internet · Social Media
Responsibilities
Conduct cutting-edge research and development in computer vision and natural language processing, especially in multi-modality and vision and language
Enhance multimodal understanding and reasoning (images, videos, etc.) throughout the entire development process, encompassing data acquisition, model evaluation, pre-training, SFT, reward modeling, and reinforcement learning, to improve overall performance
Synthesize large-scale, high-quality multi-modal data through methods such as rewriting, augmentation, and generation to improve the abilities of foundation models in various stages (pretraining, SFT, RLHF)
Investigate and implement robust evaluation methodologies to assess model performance at various stages (ranging from covering diverse multimodal skills to improving user preference alignment), unravel the underlying mechanisms and sources of their abilities, and utilize this understanding to drive model improvements
Qualifications
Required
Research and engineering experience in one or more areas of computer vision and natural language processing, including but not limited to: Experience in multi-modal understanding, vision and language, such as multimodal pre-training, visual instruction tuning, alignment learning, and other related topics
Experience working with and building very large-scale datasets to scale up foundation models
Experience with language models and applying them to various downstream tasks
Highly competent in algorithms and programming; Strong coding skills in Python and popular deep learning frameworks
Work and collaborate well with team members
Ability to work independently; Strong communication skills
Preferred
Candidates with publications in venues such as CVPR, ECCV, ICCV, NeurIPS, ICLR, ICML, EMNLP, ACL, and NAACL
Candidates with impactful open-source projects on GitHub and a demonstrated engineering ability to quickly solve new challenges
Benefits
Employees have day-one access to medical, dental, and vision insurance
A 401(k) savings plan with company match
Paid parental leave
Short-term and long-term disability coverage
Life insurance
Wellbeing benefits
10 paid holidays per year
10 paid sick days per year
17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)
Company
ByteDance
ByteDance is a technology company that develops content creation platforms and services.
H1B Sponsorship
ByteDance has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship
Trends of Total Sponsorships
2025 (1350)
2024 (1123)
2023 (775)
2022 (487)
2021 (417)
2020 (245)
Funding
Current Stage: Late Stage
Total Funding: $9.8B
Key Investors: Capital Today, G42, Tiger Global Management
2025-11-20 · Secondary Market · $300M
2024-07-25 · Secondary Market
2023-03-14 · Secondary Market · $100M
Company data provided by Crunchbase