Tencent · 2 months ago

Research Scientist – Speech and Audio Understanding (Large Models & Multimodal Systems)

Bellevue, WA

Full-time

Onsite

Senior Level

$122K/yr - $230K/yr

Tencent is a leading technology company focused on innovation and development. They are seeking a Research Scientist to join their core research team, focusing on large-scale multimodal model systems that support speech and audio understanding, as well as advancing research in speech representation and multimodal applications.

Advertising · Internet · Online Games · Online Portals · Social Media Marketing

Growth Opportunities

H1B Sponsor Likely

Responsibilities

Develop general-purpose, end-to-end large speech models covering multilingual automatic speech recognition (ASR), speech translation, speech synthesis, paralinguistic understanding, and general audio understanding

Advance research on speech representation learning and encoder/decoder architectures to build unified acoustic representations for multi-task and multimodal applications

Explore representation alignment and fusion mechanisms between audio/speech and other modalities in large multimodal models, enabling joint modeling with image and text

Build and maintain high-quality multimodal speech datasets, including automatic annotation and data synthesis technologies

Qualifications

Speech signal processing · Large model architectures · Speech system development · Deep learning frameworks · Transformer-based architectures · Multimodal alignment · Research experience · Multilingual systems

Required

Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, Linguistics, or a related field; or Master's degree with several years of relevant experience

Solid understanding of speech and audio signal processing, acoustic modeling, language modeling, and large model architectures

Proficient in one or more core speech system development pipelines such as ASR, TTS, or speech translation; experience with multilingual, multitask, or end-to-end systems is a plus

Familiar with Transformer-based architectures and their applications in speech and multimodal training/inference

Preferred

Candidates with in-depth research or practical experience in the following areas are strongly preferred:

Speech representation pretraining (e.g., HuBERT, Wav2Vec, Whisper)

Multimodal alignment and cross-modal modeling (e.g., audio-visual-text)

Experience driving state-of-the-art (SOTA) performance on audio understanding tasks with large models

Proficient in deep learning frameworks such as PyTorch or TensorFlow; experience with large-scale training and distributed systems is a plus

Benefits

Sign on payment

Relocation package

Restricted stock units

Medical, dental, vision, life and disability benefits

Participation in the Company’s 401(k) plan

15 to 25 days of vacation per year

Up to 13 days of holidays throughout the calendar year

Up to 10 days of paid sick leave per year

Company

Tencent

Glassdoor: 4.0

Tencent is an internet service portal offering value-added internet, mobile, telecom, and online advertising services.

Founded in 1998

Shenzhen, Guangdong, CHN

10001+ employees

https://www.tencent.com/

H1B Sponsorship

Tencent has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)

Trends of Total Sponsorships

2025 (3)

2024 (11)

2023 (2)

2022 (2)

Funding

Current Stage

Public Company

Total Funding

$13.84B

Key Investors

Lippo Group

2025-09-16 · Post-IPO Debt · $1.27B

2020-05-29 · Post-IPO Debt · $6B

2019-08-29 · Post-IPO Debt · $6.5B

Leadership Team

Dowson Tong

CEO of Tencent Cloud and Smart Industries Group (CSIG)

James Mitchell

Chief Strategy Officer and Senior Executive Vice President

Recent News

Law.asia

Firms steer AI developer MiniMax’s HKD4.82bn Hong Kong IPO

2026-01-16

PC Gamer

It's January 13 and Ubisoft just announced its second round of layoffs for 2026, at Massive Entertainment and Ubisoft Stockholm

2026-01-16

GlobeNewswire

WeRide Makes Robotaxi Booking Effortless via Tencent's Super-app WeChat in China

2026-01-16

Company data provided by Crunchbase