Xerxes Global · 1 month ago

AI Architect

United States

Full-time

Remote

Senior Level

Xerxes Global is seeking an experienced AI Architect to lead the design, development, and production deployment of autonomous multi-agent systems. The role involves creating complex workflows and ensuring the reliability and performance of AI agents through architecture, engineering, and operational strategies.

Finance · Financial Services · Venture Capital

H1B Sponsor Likely

Responsibilities

Design multi-agent architectures (e.g., Supervisor-Worker, Hierarchical Teams) capable of breaking down complex user queries into sub-tasks

Define the state management strategy to ensure agents retain context, memory, and user intent across long-running workflows

Architect robust Retrieval-Augmented Generation (RAG) pipelines that allow agents to query proprietary data with high precision

Select and integrate appropriate LLM orchestration frameworks (e.g., LangGraph, AutoGen, CrewAI) based on use-case requirements

Implement tool-use capabilities (function calling), enabling agents to interact with internal APIs, databases, and third-party SaaS platforms safely

Develop guardrails and steering mechanisms (e.g., NeMo Guardrails, LMQL) to ensure agents stay "on-rails" and avoid hallucinations or unsafe actions

Optimize prompt engineering strategies (Chain-of-Thought, ReAct, Tree of Thoughts) for maximum reliability and minimum latency

Oversee the transition from prototype to production, ensuring code is modular, testable, and scalable

Implement evaluation frameworks (e.g., Ragas, TruLens, DeepEval) to quantitatively measure agent performance, accuracy, and hallucination rates before deployment

Design observability dashboards (using tools like LangSmith, Arize, or Datadog) to trace agent reasoning steps, token usage, and latency in real time

Manage cost and performance trade-offs, implementing caching strategies and selecting the right model mix (e.g., routing simpler tasks to smaller/cheaper models like GPT-4o-mini or Llama 3)

Qualifications

Python · LangChain · Vector Databases · OpenAI API · Docker · Kubernetes · API security · TypeScript · Agentic libraries · Serverless architectures · Fine-tuning SLMs · Graph RAG · Structured outputs

Required

Minimum experience: Experienced

Expert proficiency in Python; familiarity with TypeScript is a plus

Deep experience with LangChain and specifically agentic libraries like LangGraph, AutoGen, or Semantic Kernel

Experience deploying and managing vector stores like Pinecone, Weaviate, Qdrant, or pgvector

Hands-on experience integrating OpenAI (GPT-4), Anthropic (Claude), and open-source models (via Ollama or vLLM)

Experience containerizing AI applications (Docker, Kubernetes) for cloud deployment (AWS/Azure/GCP)

Familiarity with serverless architectures for handling asynchronous agent tasks

Knowledge of API security standards (OAuth, API Keys) for securing agent tool access

Preferred

Experience fine-tuning small language models (SLMs) for specific domain tasks to reduce costs and improve latency

Background in Graph RAG (using Knowledge Graphs alongside Vector DBs) for better reasoning capabilities

Experience with structured outputs (using Pydantic or Instructor) to constrain LLMs to return valid, schema-conforming JSON

Company

Xerxes Global

Xerxes Global is a capital-raising company that operates in venture capital, software, cloud, and video production services.

Founded in 2010

Minneapolis, Minnesota, USA

51-200 employees

https://xerxesglobal.com/

H1B Sponsorship

Xerxes Global has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. (Data from the US Department of Labor)

Total sponsorships by year: 2024 (1)

Funding

Current Stage

Growth Stage

Leadership Team

Ashley (Irvin) McGhee

Recruiting Partner

Emily Callahan

Chief Innovation & Administration Officer

Company data provided by Crunchbase