Anthropic · 1 day ago
Policy Manager, Harmful Persuasion
Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. The company is seeking a Safeguards Product Policy Manager for Harmful Persuasion to develop and maintain policies that prevent the misuse of AI systems, with a focus on election integrity and harmful manipulation.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
Responsibilities
Develop and maintain comprehensive policy frameworks for harmful persuasion risks, especially in the context of election integrity, influence operations, and fraud
Design clear, enforceable policy language that can be consistently applied by enforcement teams and translated into technical detection requirements
Design and oversee the execution of evaluations that assess the model's capability to leverage, produce, and execute deceptive and harmful persuasive techniques
Write and refine external-facing Usage Policy language that clearly communicates policy violations and restrictions to users and external stakeholders
Develop training guidelines, assessment rubrics, and evaluation protocols
Validate enforcement decisions and automated assessments, providing qualitative analysis and policy guidance on complex edge cases
Coordinate with external experts, civil society organizations, and academics to gather feedback on policy clarity and coverage
Provide policy input on UX design for interventions, ensuring user-facing elements align with policy intent and minimize friction for legitimate use
Contribute to model safety improvements in conjunction with the Finetuning team
Support regulatory compliance efforts including consultations related to the EU AI Act and other emerging AI governance frameworks
Function as an escalation point for complex harmful persuasion cases requiring expert policy judgment
Qualifications
Required
5+ years of experience in policy development, trust & safety policy, or platform policy with working experience across the following: election integrity, fraud/scams, coordinated inauthentic behavior, influence operations, or misinformation
General knowledge of the global regulatory landscape around election integrity, platform regulation, and digital services accountability
Strong policy writing skills with the ability to translate complex risk frameworks into clear, enforceable guidelines
Experience designing policies and workflows that enable both clear human enforcement decision-making and technical implementation in ML classifiers and detection pipelines
Strong collaboration skills and extensive experience partnering effectively with Engineering, Data Science, Legal, and Policy teams on cross-functional initiatives
Excellent written and verbal communication skills, with the ability to explain complex manipulation tactics and policy rationales to diverse audiences
At least a Bachelor's degree in a related field or equivalent experience
Preferred
Strong familiarity with election integrity, political psychology, information integrity, and democratic resilience research
Knowledge of persuasion theory, influence tactics, cognitive biases, and psychological manipulation techniques
Experience working with EU institutions, regulatory bodies, or policy organizations on AI governance or digital platform regulation
Experience conducting adversarial testing, red teaming, or vulnerability assessments for AI systems or platforms
Familiarity with generative AI capabilities and understanding of how LLMs can be used for personalized persuasion, social engineering, or influence at scale
Benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The information below is provided for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025: 105
2024: 13
2023: 3
2022: 4
2021: 1
Funding
Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Company data provided by Crunchbase