Pocket FM · 1 day ago
Research Scientist TTS
Pocket FM is on a mission to deliver personalized and immersive audio experiences to listeners worldwide. They are seeking an experienced research scientist to drive innovation in long-form content generation and localization, focusing on creating culturally-tailored storytelling experiences and developing state-of-the-art TTS systems.
Responsibilities
Model Development: Design, implement, and optimize modern neural TTS systems, including diffusion- and flow-based architectures, neural codec–based speech generation, and LLM-conditioned or hybrid speech synthesis models for expressive, long-form audio
Speech Controllability: Develop methods for fine-grained control over speech attributes like pitch, rhythm, emotion, and speaker style to enhance storytelling quality
Efficiency & Latency: Optimize models for real-time inference and high-scale production, utilizing techniques like knowledge distillation and model quantization
Multilingual Synthesis: Spearhead research into cross-lingual and multilingual TTS to support global content localization
Quality Evaluation: Design and implement robust evaluation frameworks, including MOS (Mean Opinion Score) and objective metrics, to assess the naturalness and intelligibility of generated speech
Qualifications
Required
Demonstrated experience in speech synthesis, digital signal processing (DSP), and audio analysis
Proficiency with speech-specific frameworks and libraries such as Coqui TTS, ESPnet, or NVIDIA NeMo
Hands-on experience with sequence-to-sequence models, GANs, Variational Autoencoders (VAEs), and Diffusion models for audio
Experience building high-quality audio datasets, including for voice cloning and speaker verification, and handling prosody
Master's or PhD degree in Computer Science, Machine Learning, or a related field
Significant Python and applied research experience in industrial settings
Proficiency in frameworks such as PyTorch or TensorFlow
Demonstrated experience in deep learning, especially language modeling with transformers and machine translation
Prior experience working with vector databases, search indices, or other data stores for search and retrieval use cases
Preference for fast-paced, collaborative projects with concrete goals, quantitatively tested through A/B experiments
Published research in peer-reviewed journals and conferences on relevant topics
Company
Pocket FM
Pocket FM creates audio series platforms for long-form audio entertainment.
H1B Sponsorship
Pocket FM has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
Distribution of Job Fields Receiving Sponsorship (chart not shown)
Total Sponsorships by Year: 2025 (1)
Funding
Current Stage: Late Stage
Total Funding: $212.52M
Key Investors: Silicon Valley Bank, Lightspeed India Partners, Tencent
2024-03-15: Series D · $103M
2023-05-02: Debt Financing · $16M
2022-03-03: Series C · $64.83M
Company data provided by Crunchbase