Reinforce Labs · 6 hours ago

Adversarial Prompt Expert

United States

Contract

Remote

Mid Level

Reinforce Labs is seeking a creative Adversarial Prompt Expert to join their team. In this role, you will design and execute complex jailbreak attempts to identify vulnerabilities in LLMs, while using your background in linguistics or social sciences to uncover hidden biases and document attack vectors for engineering teams.

Computer Software

Responsibilities

Design and execute complex jailbreak attempts to identify vulnerabilities in state-of-the-art models

Use your background in linguistics or social sciences to find "hidden" biases or harms that standard automated filters miss

Model Evaluation: Systematically rank LLM outputs to determine where safety guardrails are failing or succeeding

Knowledge Loop: Document your "attack vectors" clearly to help our engineering teams patch vulnerabilities

Qualifications

LLM Usage, Creative Evasion Techniques, Offensive Security, Analytical Reporting, Ethical Handling

Required

Proven ability to navigate complex model restrictions using creative evasion techniques

Heavy LLM usage: hands-on experience with multiple models (open- and closed-source) and comfort experimenting across systems and platforms

You have a 'hacker mindset.' You enjoy the puzzle of finding edge cases and can think of ten different ways to ask a forbidden question

You can turn a chaotic afternoon of prompt-hacking into a clean, actionable report

You understand the weight of this work. You can handle sensitive or 'dark' content professionally and stay within ethical boundaries

Preferred

Background in offensive security or red teaming is a major plus

You don't give up when a model says 'I cannot fulfill this request.' You find a new angle

Company

Reinforce Labs

Making the internet a safer place using AI.

2-10 employees

https://www.reinforcelabs.ai

Funding

Current Stage

Early Stage

Company data provided by Crunchbase