Senior, AI Engineer - Voice Agent Platform - AA, Remote: Costa Rica - Colombia, Full time, Data & AI

Gorilla Logic · 1 month ago


Gorilla Logic is a company that builds smarter, faster, and stronger systems focused on cutting-edge AI technologies. They are seeking a Senior AI Engineer to design and implement advanced voice agent systems that can handle complex customer interactions and real-time decision-making.

Consulting · Software
Growth Opportunities

Responsibilities

Design and implement LangGraph-based agent architectures with multi-turn memory, real-time decision-making, and complex state management
Build autonomous voice agents that handle interruptions, context switching, and live customer interactions
Develop specialized agent types (customer service, sales, routing) with intelligent tool and function calling capabilities
Implement agent evaluation systems using LLM-as-Judge methodologies to assess accuracy, hallucination detection, and goal achievement
Create configurable templates for rapid, multi-tenant deployment and scalability
Integrate and optimize LLM providers (OpenAI GPT-4o/GPT-5, Groq Llama 4, Anthropic Claude) with dynamic model routing and fallback strategies
Apply advanced prompt engineering techniques for voice-first applications, including templating, few-shot learning, and context management
Build streaming LLM pipelines that coordinate sentence-level text generation with real-time text-to-speech synthesis
Develop function calling frameworks for tools like call transfer, conferencing, recording, and external integrations
Implement cost optimization strategies balancing performance, latency, and API usage across thousands of sessions
Build real-time speech-to-text pipelines using Deepgram Nova-3 with voice activity detection and interruption handling
Implement multi-provider text-to-speech orchestration (ElevenLabs, Deepgram, Cartesia) with voice cloning and tone control
Develop low-latency audio streaming over WebSockets with buffering, codec handling, and error recovery
Create dual-channel recording systems with speaker separation for QA and data collection
Optimize end-to-end latency in the STT → LLM → TTS pipeline to achieve natural conversational flow
Extend agents to handle text, voice, and vision inputs using GPT-4o multimodal capabilities
Build cross-modal reasoning systems that combine transcription, context, and visual data
Implement document and image understanding features for real-time reference during conversations
Design evaluation frameworks to assess multimodal performance and interaction quality
Architect event-driven microservices using NATS JetStream for reliable message delivery
Build multi-tenant RPC frameworks with access controls, secrets management, and isolation
Deploy to Kubernetes with autoscaling, health checks, and fault-tolerant design
Implement observability solutions using OpenTelemetry for full pipeline visibility
Create idempotency and reliability mechanisms to handle high concurrency at scale

Qualifications

LangChain · OpenAI GPT-4o/GPT-5 · Deepgram · Node.js · Voice AI systems · Kubernetes · TypeScript · Prompt engineering · Multimodal AI · Soft skills

Required

Proven experience building production-grade agentic AI systems using LangChain, LangGraph, or AutoGPT
Deep understanding of ReAct agent architectures, tool use, memory systems, and multi-agent orchestration
Hands-on integration with LLM APIs such as OpenAI GPT-4o/GPT-5, Anthropic Claude, and Groq Llama 4
Expertise in prompt engineering, few-shot learning, and system prompt optimization
Experience managing function calling pipelines, latency, hallucination control, and streaming responses
2+ years developing voice AI systems with Deepgram, OpenAI Whisper, ElevenLabs, or similar providers
Knowledge of audio codecs (MULAW, PCM), VAD, noise cancellation, and real-time audio streaming
Experience with WebRTC, LiveKit, Twilio, or Telnyx for real-time communications
Familiarity with multimodal AI models like GPT-4o or Gemini for cross-modal reasoning
Strong proficiency in Node.js (22+) and TypeScript, using modern async and event-driven patterns
Experience with Express.js and MongoDB (Mongoose) for high-write and time-series workloads
Knowledge of NATS JetStream, Kafka, or RabbitMQ for message streaming
Skilled in designing RESTful APIs and WebSocket services
Hands-on experience with Kubernetes and Docker for scalable deployments
Familiarity with AWS services such as Secrets Manager and S3
Proficiency in OpenTelemetry for distributed tracing and observability
Experience using GitHub Actions, Kustomize, pnpm workspaces, and Changesets for CI/CD
Understanding of distributed systems fundamentals—idempotency, retries, circuit breakers, and high availability
Experience optimizing end-to-end latency in voice pipelines
Knowledge of Silero VAD, dual-channel recording, and audio data collection strategies
Familiarity with performance profiling, testing AI systems, and cost optimization for large-scale voice agents
Understanding of multi-tenant SaaS security, RBAC, and secrets management
Experience designing for fault tolerance and data isolation at scale

Preferred

Contributions to open-source AI frameworks such as LangChain, LlamaIndex, or Haystack
Published research or blogs on agentic AI, LLM orchestration, or voice AI
Experience with telephony systems (SIP, Twilio, Telnyx, WebRTC)
Proven success optimizing LLM cost and performance in production
Participation in AI safety, evaluation, or red-teaming initiatives
Experience building or debugging agent observability systems

Company

Gorilla Logic

Gorilla Logic provides custom application development services on the ground and in the cloud to many of the world's leading

Funding

Current Stage: Late Stage
Total Funding: unknown
Key Investors: Sverica Capital (Private Equity, 2018-10-25)

Leadership Team

Roberto Billa
Chief Financial Officer
Company data provided by Crunchbase