Deepgram · 2 weeks ago

Research Staff, Voice AI Foundations

United States

Full-time

Remote

Lead/Staff

Deepgram is the leading voice AI platform for developers building advanced speech technologies. As a member of the Research Staff, you will pioneer the development of Latent Space Models to address challenges in voice AI, focusing on creating innovative neural audio codecs and generative models that enhance human-machine interaction.

Artificial Intelligence (AI), Data Collection and Labeling, Developer APIs, Natural Language Processing, Speech Recognition

H1B Sponsor Likely

Responsibilities

Build next-generation neural audio codecs that achieve extreme low-bit-rate compression and high-fidelity reconstruction across a world-scale corpus of general audio

Pioneer steerable generative models that can synthesize the full diversity of human speech from the codec latent representation, from casual conversation to highly emotional expression to complex multi-speaker scenarios with environmental noise and overlapping speech

Develop embedding systems that cleanly factorize the codec latent space into interpretable dimensions of speaker, content, style, environment, and channel effects -- enabling precise control over each aspect and the ability to massively amplify an existing seed dataset through 'latent recombination'

Leverage latent recombination to generate synthetic audio data at previously impossible scales, unlocking joint model and data scaling paradigms for audio

Endeavor to train multimodal speech-to-speech systems that can 1) understand any human irrespective of their demographics, state, or environment and 2) produce empathic, human-like responses that achieve conversational or task-oriented objectives

Design model architectures, training schemes, and inference algorithms that are adapted to the hardware at the bare-metal level, enabling cost-efficient training on billion-hour datasets and powering real-time inference for hundreds of millions of concurrent conversations

Qualifications

Statistical learning theory, Foundation model architectures, Data pipeline development, Model optimization, Experimental design, Neural audio codecs, Generative models, Multimodal learning, AI-first mindset, Creative problem solving, Collaboration

Required

Strong mathematical foundation in statistical learning theory, particularly in areas relevant to self-supervised and multimodal learning

Deep expertise in foundation model architectures, with an understanding of how to scale training across multiple modalities

Proven ability to bridge theory and practice—someone who can both derive novel mathematical formulations and implement them efficiently

Demonstrated ability to build data pipelines that can process and curate massive datasets while maintaining quality and diversity

Track record of designing controlled experiments that isolate the impact of architectural innovations and validate theoretical insights

Experience optimizing models for real-world deployment, including knowledge of hardware constraints and efficiency techniques

History of open-source contributions or research publications that have advanced the state of the art in speech/language AI

Company

Deepgram

Deepgram provides a voice artificial intelligence platform for speech-to-text, text-to-speech, and voice applications.

Founded in 2015

San Francisco, California, USA

51-200 employees

https://deepgram.com

H1B Sponsorship

Deepgram has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship

Trends of Total Sponsorships

2025 (2)

2024 (1)

2022 (1)

Funding

Current Stage

Growth Stage

Total Funding

$229.09M

Key Investors

AVP, Tiger Global Management, IQT

2026-01-13 · Series C · $143.17M

2022-11-26 · Series B · $47M

2021-02-03 · Series B · $25M

Leadership Team

Scott Stephenson

Co-Founder and CEO

Adam Sypniewski

Chief Technology Officer

Recent News

Tech Startups - Tech News, Tech Trends & Startup Funding

Voice AI startup Deepgram raises $130M at $1.3B valuation to fuel global expansion

2026-01-16

Company data provided by Crunchbase