Member of Technical Staff: RL Environments jobs in United States

Bespoke Labs · 5 months ago

Member of Technical Staff: RL Environments

Bespoke Labs is an applied AI research lab pioneering data and RL environment curation for training and evaluating agents. The role involves developing systematic strategies for creating high-quality RL environments, analyzing agent behavior, and producing benchmark environments for AI agents.

Computer Software

Responsibilities

Develop systematic strategies and recipes for creating high-quality RL environments that effectively train and evaluate agents
Study how LLMs and agents fail across different task types, identifying patterns that inform better environment design
Create benchmark environments that test specific agent capabilities, packaging them for external release on our evaluation platform
Verify environment quality through hands-on testing—training small-scale agents, checking for reward hacking, and analyzing training dynamics
Work with our environment creation pipeline to scale production of validated environments
Analyze agent rollout data to uncover insights about what makes environments challenging, diverse, and pedagogically valuable
Collaborate with the team to ensure benchmarks integrate smoothly into our external-facing dashboards
Establish quality standards and evaluation protocols that maintain a high bar as we scale environment production

Qualifications

Machine Learning, Python, Reinforcement Learning, Data Analysis, Cloud Platforms, Curiosity, Attention to Detail, Collaboration

Required

Strong foundation in machine learning—either through a PhD/MS in ML, CS, or equivalent industry experience
Deep curiosity about agent behavior and failure modes, with ability to form hypotheses and test them systematically
Experience analyzing complex systems and extracting actionable insights from data
Patience and attention to detail for studying agent rollouts and identifying subtle patterns
Proficiency in Python and ML frameworks (PyTorch, JAX, or similar)
Experience with RL concepts and agent training, even if not from an RL background
Ability to design experiments, run training loops, and interpret results
Comfort working with cloud platforms (GCP, AWS) to run experiments at scale
Ability to build pipelines and automation that scale research insights into production
Experience with data analysis tools and creating reproducible workflows
Systematic approach to quality verification and testing

Preferred

Hands-on experience with reinforcement learning or agent training systems
Background in data curation, dataset creation, or evaluation benchmark design
Experience with AI safety, robustness testing, or adversarial evaluation
Publications or projects related to RL, agent evaluation, or data-centric AI
Understanding of how to design environments that surface specific failure modes
Experience shipping research artifacts (datasets, benchmarks, evaluation suites) to the community

Benefits

Health coverage
Flexible work arrangements
The opportunity to shape how the AI community evaluates and trains agents

Company

Bespoke Labs

RL for Agents

Funding

Current Stage
Early Stage
Company data provided by Crunchbase