Voice Recognition Engineer – Browser-Based Speech Interfaces jobs in United States

New York Technology Partners · 4 days ago

Voice Recognition Engineer – Browser-Based Speech Interfaces

New York Technology Partners is seeking a Voice Recognition Engineer to enhance browser-based speech interfaces. The role involves developing and optimizing voice recognition functionality across multiple browsers, integrating various speech APIs, and ensuring a high-quality user experience.

Consulting, Information Technology, Software, Staffing Agency
H1B Sponsor Likely
Hiring Manager
Jay Doshit

Responsibilities

Develop and optimize voice recognition functionality across Chrome, Edge, Safari, Firefox, and Brave
Ensure consistent performance, compatibility, and user experience across desktop, laptop, mobile, and tablet environments
Customize and extend the Web Speech API and integrate third-party speech frameworks, including (but not limited to):
ElevenLabs (Scribe)
AssemblyAI
Deepgram
OpenAI Whisper API
Google Cloud Speech-to-Text
Microsoft Azure Speech-to-Text / Text-to-Speech
Amazon Transcribe / Polly
Optimize recognition speed, accuracy, and robustness, especially in noisy or low-bandwidth environments
Conduct benchmarking and tuning for real-world usage scenarios across diverse accents, languages, and acoustic conditions
Collaborate with product and design teams to build intuitive, inclusive voice interactions
Support configurable speech duration thresholds and accessibility best practices for users with varying abilities
Partner with technical leads and product managers to align voice capabilities with the product roadmap
Support client-facing pilots, demos, and proof-of-concept initiatives
Explore novel approaches to push the boundaries of browser-based voice UI

Qualifications

Web Speech API, Speech recognition, JavaScript/TypeScript, Cross-browser compatibility, Multilingual support, Audio processing, User experience, Collaboration, Problem-solving

Required

Hands-on experience with the Web Speech API and at least one other commercial speech framework
Experience implementing custom logic for error handling, timeout management, speech completion detection, and multilingual support
3+ years of experience in speech recognition, voice UI, or audio processing
Demonstrated work with Web Speech API and at least one of the following: ElevenLabs, AssemblyAI, Deepgram, OpenAI Whisper, Google Cloud STT, Azure Speech, or Amazon Transcribe
Strong JavaScript/TypeScript skills with expertise in browser-based audio capture and processing
Experience testing and debugging across Chrome, Safari, Firefox, Edge, and mobile browsers
Understanding of latency, privacy, and security considerations in client-side voice processing

Preferred

Experience with WebRTC, MediaRecorder API, or AudioContext
Background in natural language understanding (NLU) or voice assistant development
Contributions to open-source speech or accessibility projects
Familiarity with CI/CD testing for cross-browser compatibility

Company

New York Technology Partners

New York Technology Partners is an information technology company that provides IT and engineering services.

H1B Sponsorship

New York Technology Partners has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (52)
2024 (74)
2023 (59)
2022 (104)
2021 (74)
2020 (115)

Funding

Current Stage
Growth Stage

Leadership Team

Amy Reynolds
Senior Talent Acquisition Partner
Hetal Shah
Technical Recruiter at New York Technology Partners
Company data provided by Crunchbase