New York Technology Partners · 4 days ago

Voice Recognition Engineer – Browser-Based Speech Interfaces

United States

Contract

Remote

Mid Level

3+ years exp

New York Technology Partners is seeking a Voice Recognition Engineer to enhance browser-based speech interfaces. The role involves developing and optimizing voice recognition functionality across multiple browsers, integrating various speech APIs, and ensuring a high-quality user experience.

Consulting · Information Technology · Software · Staffing Agency

H1B Sponsor Likely

Hiring Manager

Jay Doshit

Responsibilities

Develop and optimize voice recognition functionality across Chrome, Edge, Safari, Firefox, and Brave

Ensure consistent performance, compatibility, and user experience across desktop, laptop, mobile, and tablet environments

Customize and extend the Web Speech API and integrate third-party speech frameworks, including (but not limited to):

ElevenLabs (Scribe)

AssemblyAI

Deepgram

OpenAI Whisper API

Google Cloud Speech-to-Text

Microsoft Azure Speech-to-Text / Text-to-Speech

Amazon Transcribe / Polly

Optimize recognition speed, accuracy, and robustness, especially in noisy or low-bandwidth environments

Conduct benchmarking and tuning for real-world usage scenarios across diverse accents, languages, and acoustic conditions

Collaborate with product and design teams to build intuitive, inclusive voice interactions

Support configurable speech duration thresholds and accessibility best practices for users with varying abilities

Partner with technical leads and product managers to align voice capabilities with the product roadmap

Support client-facing pilots, demos, and proof-of-concept initiatives

Explore novel approaches to push the boundaries of browser-based voice UI

Qualifications

Web Speech API · Speech recognition · JavaScript/TypeScript · Cross-browser compatibility · Multilingual support · Audio processing · User experience · Collaboration · Problem-solving

Required

Hands-on experience with the Web Speech API and at least one other commercial speech framework

Experience implementing custom logic for error handling, timeout management, speech completion detection, and multilingual support

3+ years of experience in speech recognition, voice UI, or audio processing

Demonstrated work with Web Speech API and at least one of the following: ElevenLabs, AssemblyAI, Deepgram, OpenAI Whisper, Google Cloud STT, Azure Speech, or Amazon Transcribe

Strong JavaScript/TypeScript skills with expertise in browser-based audio capture and processing

Experience testing and debugging across Chrome, Safari, Firefox, Edge, and mobile browsers

Understanding of latency, privacy, and security considerations in client-side voice processing

Preferred

Experience with WebRTC, MediaRecorder API, or AudioContext

Background in natural language understanding (NLU) or voice assistant development

Contributions to open-source speech or accessibility projects

Familiarity with CI/CD testing for cross-browser compatibility

Company

New York Technology Partners

New York Technology Partners is an information technology company that provides IT and engineering services.

Founded in 1999

Iselin, New Jersey, USA

201-500 employees

http://nytp.com

H1B Sponsorship

New York Technology Partners has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by the US Department of Labor)

Distribution of Different Job Fields Receiving Sponsorship (chart not shown; legend: job fields similar to this job)

Trends of Total Sponsorships

2025 (52)

2024 (74)

2023 (59)

2022 (104)

2021 (74)

2020 (115)

Funding

Current Stage

Growth Stage

Leadership Team

Amy Reynolds

Senior Talent Acquisition Partner

Hetal Shah

Technical Recruiter at New York Technology Partners

Company data provided by Crunchbase