Adversarial Prompt Expert jobs in United States

Reinforce Labs · 6 hours ago

Adversarial Prompt Expert

Reinforce Labs is seeking a creative Adversarial Prompt Expert to join their team. In this role, you will design and execute complex jailbreak attempts to identify vulnerabilities in LLMs, while using your background in linguistics or social sciences to uncover hidden biases and document attack vectors for engineering teams.
Computer Software

Responsibilities

Design and execute complex jailbreak attempts to identify vulnerabilities in state-of-the-art models
Use your background in linguistics or social sciences to find "hidden" biases or harms that standard automated filters miss
Model Evaluation: Systematically rank LLM outputs to determine where safety guardrails are failing or succeeding
Knowledge Loop: Document your "attack vectors" clearly to help our engineering teams patch vulnerabilities

Qualifications

LLM Usage, Creative Evasion Techniques, Offensive Security, Analytical Reporting, Ethical Handling

Required

Proven ability to navigate complex model restrictions using creative evasion techniques
Heavy LLM usage: hands-on experience with multiple models (open- and closed-source) and comfort experimenting across systems and platforms
You have a 'hacker mindset.' You enjoy the puzzle of finding edge cases and can think of ten different ways to ask a forbidden question
You can turn a chaotic afternoon of prompt-hacking into a clean, actionable report
You understand the weight of this work. You can handle sensitive or 'dark' content professionally and stay within ethical boundaries

Preferred

Background in offensive security or red teaming is a major plus
You don't give up when a model says 'I cannot fulfill this request.' You find a new angle

Company

Reinforce Labs

Making the internet a safer place using AI.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase