Prime Intellect · 3 months ago
Applied Research - Evals & Data
Prime Intellect is building the open superintelligence stack, and they are seeking a professional to work at the intersection of cutting-edge reinforcement learning and applied data. The role involves advancing agent capabilities, building robust infrastructure, and serving as a bridge between customers and research by translating insights into technical requirements.
Agentic AI · Artificial Intelligence (AI) · Cloud Computing · Machine Learning
Responsibilities
Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads: workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
Bridge Between Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
Prototype in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.
Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks
Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams
Translate customer insights and evaluation results into roadmap and research direction
Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks
Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows
Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities
Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions
Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making
Extend and integrate with agent frameworks to support evolving feature requests and performance requirements
Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency
Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments
Qualifications
Required
Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment
Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines)
Deep expertise in distributed training/inference frameworks (e.g., vLLM, sglang, Ray, Accelerate)
Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform)
Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL
Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems
Benefits
Competitive Compensation + equity incentives
Flexible Work (remote or San Francisco)
Visa Sponsorship & relocation support
Professional Development budget
Team Off-sites & conference attendance
Company
Prime Intellect
Prime Intellect is a full-stack platform that offers agentic training infrastructure for organizations to train frontier AI using LLMs.
Funding
Current Stage: Early Stage
Total Funding: $70.44M
Key Investors: Founders Fund
2026-01-15 · Series Unknown · $49.94M
2025-02-28 · Seed · $15M
2024-04-22 · Seed · $5.5M
Company data provided by Crunchbase