Anthropic · 1 day ago
Policy Manager, Harmful Persuasion
Anthropic is a public benefit corporation focused on creating reliable and interpretable AI systems. The company is seeking a Safeguards Product Policy Manager for Harmful Persuasion to develop and maintain policies that prevent the misuse of AI systems, with a focus on election integrity and harmful manipulation.
Artificial Intelligence (AI) · Foundational AI · Generative AI · Information Technology · Machine Learning
Responsibilities
Develop and maintain comprehensive policy frameworks for harmful persuasion risks, especially in the context of election integrity, influence operations, and fraud
Design clear, enforceable policy language that can be consistently applied by enforcement teams and translated into technical detection requirements
Design and oversee the execution of evaluations that assess the model's capability to leverage, produce, and execute deceptive and harmful persuasive techniques
Write and refine external-facing Usage Policy language that clearly communicates policy violations and restrictions to users and external stakeholders
Develop training guidelines, assessment rubrics, and evaluation protocols
Validate enforcement decisions and automated assessments, providing qualitative analysis and policy guidance on complex edge cases
Coordinate with external experts, civil society organizations, and academics to gather feedback on policy clarity and coverage
Provide policy input on UX design for interventions, ensuring user-facing elements align with policy intent and minimize friction for legitimate use
Contribute to model safety improvements in conjunction with the Finetuning team
Support regulatory compliance efforts including consultations related to the EU AI Act and other emerging AI governance frameworks
Function as an escalation point for complex harmful persuasion cases requiring expert policy judgment
Qualifications
Required
5+ years of experience in policy development, trust & safety policy, or platform policy with working experience across the following: election integrity, fraud/scams, coordinated inauthentic behavior, influence operations, or misinformation
General knowledge of the global regulatory landscape around election integrity, platform regulation, and digital services accountability
Strong policy writing skills with the ability to translate complex risk frameworks into clear, enforceable guidelines
Experience designing policies and workflows that enable both clear human enforcement decision-making and technical implementation in ML classifiers and detection pipelines
Strong collaboration skills and extensive experience partnering effectively with Engineering, Data Science, Legal, and Policy teams on cross-functional initiatives
Excellent written and verbal communication skills, with the ability to explain complex manipulation tactics and policy rationales to diverse audiences
At least a Bachelor's degree in a related field or equivalent experience
Preferred
Strong familiarity with election integrity, political psychology, information integrity, and democratic resilience research
Knowledge of persuasion theory, influence tactics, cognitive biases, and psychological manipulation techniques
Experience working with EU institutions, regulatory bodies, or policy organizations on AI governance or digital platform regulation
Experience conducting adversarial testing, red teaming, or vulnerability assessments for AI systems or platforms
Familiarity with generative AI capabilities and understanding of how LLMs can be used for personalized persuasion, social engineering, or influence at scale
Benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Company
Anthropic
Anthropic is an AI research company that focuses on the safety and alignment of AI systems with human values.
H1B Sponsorship
Anthropic has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. The information below is provided for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025: 105
2024: 13
2023: 3
2022: 4
2021: 1
Funding
Current Stage: Late Stage
Total Funding: $33.74B
Key Investors: Lightspeed Venture Partners, Google, Amazon
2025-09-02 · Series F · $13B
2025-05-16 · Debt Financing · $2.5B
2025-03-03 · Series E · $3.5B
Company data provided by Crunchbase