Xerxes Global · 1 month ago
AI Architect
Xerxes Global is seeking an experienced AI Architect to lead the design, development, and production deployment of autonomous multi-agent systems. The role involves creating complex workflows and ensuring the reliability and performance of AI agents through architecture, engineering, and operational strategies.
Finance · Financial Services · Venture Capital
Responsibilities
Design multi-agent architectures (e.g., Supervisor-Worker, Hierarchical Teams) capable of breaking down complex user queries into sub-tasks
Define the state management strategy to ensure agents retain context, memory, and user intent across long-running workflows
Architect robust Retrieval-Augmented Generation (RAG) pipelines that allow agents to query proprietary data with high precision
Select and integrate appropriate LLM orchestration frameworks (e.g., LangGraph, AutoGen, CrewAI) based on use-case requirements
Implement tool-use capabilities (function calling), enabling agents to interact with internal APIs, databases, and third-party SaaS platforms safely
Develop guardrails and steering mechanisms (e.g., NeMo Guardrails, LMQL) to ensure agents stay "on-rails" and avoid hallucinations or unsafe actions
Optimize prompt engineering strategies (Chain-of-Thought, ReAct, Tree of Thoughts) for maximum reliability and minimum latency
Oversee the transition from prototype to production, ensuring code is modular, testable, and scalable
Implement evaluation frameworks (e.g., Ragas, TruLens, DeepEval) to quantitatively measure agent performance, accuracy, and hallucination rates before deployment
Design observability dashboards (using tools like LangSmith, Arize, or Datadog) to trace agent reasoning steps, token usage, and latency in real time
Manage cost and performance trade-offs, implementing caching strategies and selecting the right model mix (e.g., routing simpler tasks to smaller/cheaper models like GPT-4o-mini or Llama 3)
Qualifications
Required
Minimum experience: Experienced
Expert proficiency in Python; familiarity with TypeScript is a plus
Deep experience with LangChain and specifically agentic libraries like LangGraph, AutoGen, or Semantic Kernel
Experience deploying and managing vector stores like Pinecone, Weaviate, Qdrant, or pgvector
Hands-on experience integrating OpenAI (GPT-4), Anthropic (Claude), and open-source models (via Ollama or vLLM)
Experience containerizing AI applications (Docker, Kubernetes) for cloud deployment (AWS/Azure/GCP)
Familiarity with serverless architectures for handling asynchronous agent tasks
Knowledge of API security standards (OAuth, API Keys) for securing agent tool access
Preferred
Experience fine-tuning small language models (SLMs) for specific domain tasks to reduce costs and improve latency
Background in Graph RAG (using Knowledge Graphs alongside Vector DBs) for better reasoning capabilities
Experience with structured outputs (using Pydantic/Instructor) to constrain LLMs to return valid, schema-conforming JSON
Company
Xerxes Global
Xerxes Global is a capital-raising company that operates in venture capital, software, cloud, and video production services.
H1B Sponsorship
Xerxes Global has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. (Data from the US Department of Labor)
Total sponsorships by year: 2024 (1)
Funding
Current Stage: Growth Stage
Company data provided by Crunchbase