Policy Manager, Harmful Persuasion jobs in United States

Anthropic · 1 day ago

Policy Manager, Harmful Persuasion

Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. The company is seeking a Safeguards Product Policy Manager for Harmful Persuasion to develop and maintain policies that prevent the misuse of AI systems, with a focus on election integrity and harmful manipulation.

Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
H1B Sponsored

Responsibilities

Develop and maintain comprehensive policy frameworks for harmful persuasion risks, especially in the context of election integrity, influence operations, and fraud
Design clear, enforceable policy language that can be consistently applied by enforcement teams and translated into technical detection requirements
Design and oversee execution of evaluations that assess the model's capability to leverage, produce, and execute deceptive and harmful persuasive techniques
Write and refine external-facing Usage Policy language that clearly communicates policy violations and restrictions to users and external stakeholders
Develop training guidelines, assessment rubrics, and evaluation protocols
Validate enforcement decisions and automated assessments, providing qualitative analysis and policy guidance on complex edge cases
Coordinate with external experts, civil society organizations, and academics to gather feedback on policy clarity and coverage
Provide policy input on UX design for interventions, ensuring user-facing elements align with policy intent and minimize friction for legitimate use
Contribute to model safety improvements in conjunction with the Finetuning team
Support regulatory compliance efforts including consultations related to the EU AI Act and other emerging AI governance frameworks
Function as an escalation point for complex harmful persuasion cases requiring expert policy judgment

Qualifications

Policy development · Election integrity · Trust & safety policy · Regulatory compliance · Policy writing skills · Cognitive biases · Persuasion theory · Psychological manipulation · Collaboration skills · Communication skills

Required

5+ years of experience in policy development, trust & safety policy, or platform policy, with working experience in one or more of the following areas: election integrity, fraud/scams, coordinated inauthentic behavior, influence operations, or misinformation
General knowledge of the global regulatory landscape around election integrity, platform regulation, and digital services accountability
Strong policy writing skills with the ability to translate complex risk frameworks into clear, enforceable guidelines
Experience designing policies and workflows that enable both clear human enforcement decision-making and technical implementation in ML classifiers and detection pipelines
Strong collaboration skills and extensive experience partnering effectively with Engineering, Data Science, Legal, and Policy teams on cross-functional initiatives
Excellent written and verbal communication skills, with the ability to explain complex manipulation tactics and policy rationales to diverse audiences
At least a Bachelor's degree in a related field or equivalent experience

Preferred

Strong familiarity with election integrity, political psychology, information integrity, and democratic resilience research
Knowledge of persuasion theory, influence tactics, cognitive biases, and psychological manipulation techniques
Experience working with EU institutions, regulatory bodies, or policy organizations on AI governance or digital platform regulation
Experience conducting adversarial testing, red teaming, or vulnerability assessments for AI systems or platforms
Familiarity with generative AI capabilities and understanding of how LLMs can be used for personalized persuasion, social engineering, or influence at scale

Benefits

Optional equity donation matching
Generous vacation and parental leave
Flexible working hours

Company

Anthropic

Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.

H1B Sponsorship

Anthropic has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships
2025 (105)
2024 (13)
2023 (3)
2022 (4)
2021 (1)

Funding

Current Stage
Late Stage
Total Funding
$33.74B
Key Investors
Lightspeed Venture Partners · Google · Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B

Leadership Team

Dario Amodei
CEO & Co-Founder
Daniela Amodei
President & Co-Founder
Company data provided by Crunchbase