Bespoke Labs
Member of Technical Staff: RL Environments
Bespoke Labs is an applied AI research lab pioneering data and RL environment curation for training and evaluating agents. The role involves developing systematic strategies for creating high-quality RL environments, analyzing agent behavior, and producing benchmark environments for AI agents.
Computer Software
Responsibilities
Develop systematic strategies and recipes for creating high-quality RL environments that effectively train and evaluate agents
Study how LLMs and agents fail across different task types, identifying patterns that inform better environment design
Create benchmark environments that test specific agent capabilities, packaging them for external release on our evaluation platform
Verify environment quality through hands-on testing—training small-scale agents, checking for reward hacking, and analyzing training dynamics
Work with our environment creation pipeline to scale production of validated environments
Analyze agent rollout data to uncover insights about what makes environments challenging, diverse, and pedagogically valuable
Collaborate with the team to ensure benchmarks integrate smoothly into our external-facing dashboards
Establish quality standards and evaluation protocols that maintain high bars as we scale environment production
Qualifications
Required
Strong foundation in machine learning, either through a PhD or MS in ML or CS, or equivalent industry experience
Deep curiosity about agent behavior and failure modes, with the ability to form hypotheses and test them systematically
Experience analyzing complex systems and extracting actionable insights from data
Patience and attention to detail for studying agent rollouts and identifying subtle patterns
Proficiency in Python and ML frameworks (PyTorch, JAX, or similar)
Experience with RL concepts and agent training, even if not from an RL background
Ability to design experiments, run training loops, and interpret results
Comfortable working with cloud platforms (GCP, AWS) for running experiments at scale
Ability to build pipelines and automation that scale research insights into production
Experience with data analysis tools and creating reproducible workflows
Systematic approach to quality verification and testing
Preferred
Hands-on experience with reinforcement learning or agent training systems
Background in data curation, dataset creation, or evaluation benchmark design
Experience with AI safety, robustness testing, or adversarial evaluation
Publications or projects related to RL, agent evaluation, or data-centric AI
Understanding of how to design environments that surface specific failure modes
Experience shipping research artifacts (datasets, benchmarks, evaluation suites) to the community
Benefits
Health coverage
Flexible work arrangements
The opportunity to shape how the AI community evaluates and trains agents
Company
Bespoke Labs
RL for Agents
Funding
Current Stage
Early Stage
Company data provided by Crunchbase