AI Inference Engineer - Speech jobs in United States

Zoom · 20 hours ago

AI Inference Engineer - Speech

Zoom is a company focused on building the best collaboration platform for the enterprise. They are seeking an AI Inference Engineer to develop state-of-the-art automatic speech recognition systems and optimize model inference performance for various Zoom products.

Collaboration, Information Technology, Messaging, SaaS, Video Conferencing
H1B Sponsor Likely

Responsibilities

Developing state-of-the-art speech services for Zoom products. Devising novel techniques where off-the-shelf solutions are not available
Optimizing ASR inference systems for production deployment, including inference latency, throughput, memory footprint, and resource utilization
Optimizing model inference performance by diving deep into the lower layers of the inference framework stack, with a focus on hardware-specific optimizations for NVIDIA GPUs
Proposing new model structures by joint optimization of model accuracy and inference speed
Designing and developing ASR systems with low latency and high accuracy requirements, while ensuring the scalability of GPU infrastructure and improving the throughput of the ASR service
Profiling and debugging ASR runtime performance bottlenecks across different deployment hardware and environments

Qualifications

Speech recognition, Model inference, Deep learning, NVIDIA GPU optimization, Python programming, CUDA, TensorFlow, PyTorch, Collaboration, Problem-solving

Required

Possess a Master's degree in Computer Science, Electrical Engineering, or a related field, with 3+ years of experience in speech recognition, speech-LLMs, or AI model inference
Display knowledge of deep learning and hands-on programming skills in Python, shell scripting, and C/C++; familiarity with ML frameworks such as PyTorch and TensorFlow
Demonstrate deep understanding of transformer encoder-decoder frameworks for speech recognition, including attention mechanisms, beam search and sequence-to-sequence modeling for end-to-end ASR systems
Understand recent advancements in speech foundation models and speech-LLMs that integrate acoustic and linguistic representations, enabling unified modeling for speech understanding and transcription tasks
Have experience optimizing deep learning model inference on NVIDIA GPUs, including profiling and accelerating AI models using CUDA, TensorRT, and mixed-precision computation to achieve low-latency, high-throughput performance
Have experience developing and tuning custom CUDA kernels, leveraging CUDA Graphs for efficient execution scheduling, and minimizing kernel launch overhead to maximize GPU utilization
Be proficient in end-to-end performance analysis, memory optimization, and deployment of large-scale ML models on GPU clusters; have experience with stream management, asynchronous execution, and integrating frameworks such as PyTorch and TensorFlow for real-time inference

Benefits

A variety of perks, benefits, and options to help employees maintain their physical, mental, emotional, and financial health
Support for work-life balance
Opportunities for employees to contribute to their community in meaningful ways

Company

Zoom

Zoom is a software company that offers a communications platform that connects people through video, voice, chat, and content sharing.

H1B Sponsorship

Zoom has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (16)
2024 (178)
2023 (144)
2022 (259)
2021 (86)
2020 (34)

Funding

Current Stage
Public Company
Total Funding
$276M
Key Investors
ARK Investment Management, Sequoia Capital, Emergence Capital
2021-11-04 · Post-IPO Equity · $130M
2019-04-19 · Post-IPO Equity
2019-04-18 · IPO

Leadership Team

Eric Yuan
Founder & CEO
Xuedong Huang
Chief Technology Officer
Company data provided by Crunchbase