Fabrion · 2 months ago
ML/AI Research Engineer — Agentic AI Lab (Founding Team)
Fabrion is backed by 8VC and is building a world-class team to tackle critical infrastructure problems in enterprise AI. The ML/AI Research Engineer will lead the design, training, evaluation, and optimization of agent-native AI models, working at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning.
Artificial Intelligence (AI) · Machine Learning
Responsibilities
Fine-tune and evaluate open-source LLMs (e.g. LLaMA 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data (a minimal fine-tuning sketch follows this list)
Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust — integrated with our vector DBs and internal knowledge graph
Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
Create reinforcement learning pipelines to optimize agent behaviors (e.g. RLHF, DPO, PPO)
Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
Contribute to model observability, drift detection, error classification, and alignment
Optimize inference latency and GPU resource utilization across cloud and on-prem environments
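For illustration only, here is a minimal sketch of the kind of LoRA supervised fine-tuning described above, using HuggingFace Transformers and PEFT. The base model, dataset file, and hyperparameters are placeholder assumptions, not Fabrion specifics.

```python
# Minimal sketch: LoRA supervised fine-tuning with HuggingFace Transformers + PEFT.
# Model name, dataset path, and hyperparameters are illustrative assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_name = "meta-llama/Meta-Llama-3-8B"  # assumed open-source base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Attach small low-rank adapters instead of updating the full weight matrices.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tokenize an instruction-style dataset (placeholder file with a "text" field).
dataset = load_dataset("json", data_files="sft_train.jsonl")["train"]
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)
dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2,
                           gradient_accumulation_steps=8, bf16=True,  # assumes bf16-capable GPUs
                           learning_rate=2e-4, num_train_epochs=1, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because LoRA keeps the base weights frozen and trains only the adapter matrices, this style of fine-tuning stays practical on modest GPU budgets; the batch size, precision, and rank choices above are exactly the tradeoffs listed in the qualifications below.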
Qualifications
Required
Deep experience fine-tuning open-source LLMs using HuggingFace Transformers, DeepSpeed, vLLM, FSDP, LoRA/QLoRA
Worked with both base and instruction-tuned models; familiar with SFT, RLHF, DPO pipelines
Comfortable building and maintaining custom training datasets, filters, and eval splits
Understand tradeoffs in batch size, token window, optimizer, precision (FP16, bfloat16), and quantization
Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
Familiar with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
Experience grounding models with structured data (SQL, graph, metadata) + unstructured sources
Experience training or customizing agent frameworks with multi-step reasoning and memory
Understand common agent loop patterns (e.g. Plan→Act→Reflect), memory recall, and tool use
Familiar with self-correction, multi-agent communication, and agent ops logging
Strong background in token cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval latency tuning
Experience running models under quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI); see the inference sketch after this list
Startup DNA: resourceful, fast-moving, and capable of working in ambiguity
Deep curiosity about agent-based architectures and real-world enterprise complexity
Comfortable owning model performance end-to-end: from dataset to deployment
Strong instincts around explainability, safety, and continuous improvement
Enjoy pair-designing with product and UX to shape capabilities, not just APIs
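As an illustrative reference for the quantized-inference point above, a minimal sketch of serving an int4 (AWQ) checkpoint with vLLM; the model name, GPU count, and sampling settings are assumptions rather than project settings.

```python
# Minimal sketch: quantized, tensor-parallel inference with vLLM.
# Model name, quantization scheme, and GPU count are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # assumed AWQ (int4) checkpoint
    quantization="awq",                              # use the quantized kernels
    tensor_parallel_size=2,                          # shard across 2 GPUs (assumption)
    gpu_memory_utilization=0.90,                     # leave headroom for the KV cache
)

params = SamplingParams(temperature=0.2, top_p=0.9, max_tokens=256)

prompts = ["Summarize the key risks in the attached maintenance log:"]  # placeholder prompt
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```

In this setup, latency and throughput tuning is mostly a matter of quantization choice, batch scheduling, and KV-cache memory rather than model code.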
Preferred
Experience with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems
Tech Stack
LLM Training & Inference: HuggingFace Transformers, DeepSpeed, vLLM, FlashAttention, FSDP, LoRA
Agent Orchestration: LangChain, LangGraph, ReAct, OpenAgents, LlamaIndex
Vector DBs: Weaviate, Qdrant, FAISS, Pinecone, Chroma
Graph Knowledge Systems: Neo4j, PuppyGraph, RDF, Gremlin, JSON-LD
Storage & Access: Iceberg, DuckDB, Postgres, Parquet, Delta Lake
Evaluation: OpenLLM Evals, TruLens, Ragas, LangSmith, Weights & Biases
Compute: Ray, Kubernetes, TGI, SageMaker, Lambda Labs, Modal
Languages: Python (core), optionally Rust (for inference layers) or JS (for UX experimentation)
Benefits
Competitive salary
Meaningful equity (founding tier)
Company
Fabrion
Fabrion is an AI-native platform purpose-built for the new industrial era
Funding
Current Stage: Early Stage
Total Funding: unknown
Latest Round: Seed (2026-01-01)
Company data provided by Crunchbase