Centific · 2 months ago

AI Safety Research Intern-2

Redmond, Washington

Internship

Hybrid

Intern

Centific is a frontier AI data foundry that empowers clients with safe, scalable AI deployment. The AI Safety Research Intern will focus on advancing AI safety, designing and evaluating attack and defense strategies for LLM jailbreaks, and contributing to the platform's security guarantees through high-impact experiments.

Analytics, Artificial Intelligence (AI), Database, Information Technology, Retail Technology

H1B Sponsor Likely

Responsibilities

Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming)

Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, social engineering risks, and over-defensive vs. permissive response tradeoffs

Agentic AI Security: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that span perception, reasoning, and action

Benchmark & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial resilience, and defense effectiveness across diverse models (including latest benchmarks and real-world exploit scenarios)

Deploy and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, Docker, Ray, FastAPI); integrate safety telemetry, anomaly detection, and continuous red-teaming

Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, Mistral, Gemma, etc.), uncovering novel exploits and defense gaps

Multi-turn Obfuscation Defense: Implement context-aware, multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits)

Agent Self-Regulation: Develop agentic architectures for autonomous self-checking and self-correction, minimizing risk in complex, multi-agent environments

Human-Centered Safety: Study human behavior models in adversarial contexts: how users probe, trick, or manipulate LLMs, and how defenses can adapt without becoming over-defensive

Qualifications

Python, PyTorch, AI Safety Research, Adversarial ML, Multi-agent architectures, Kubernetes, Docker, FastAPI, Human-AI interaction, Red-teaming, Benchmarking, GitHub

Required

Ph.D. student in CS/EE/ML/Security (or a related field); actively publishing in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, BlackHat, IEEE S&P, etc.)

Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation

Demonstrated research in at least one of: LLM jailbreak attacks/defense, agentic AI safety, human-AI interaction vulnerabilities

Proven ability to go from concept → code → experiment → result, with rigorous tracking and ablation studies

Preferred

Experience in adversarial prompt engineering and jailbreak detection (narrative, obfuscated, sequential attacks)

Prior work on multi-agent architectures or robust defense strategies for LLMs

Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards

Scalable training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols

Public code artifacts (GitHub) and first-author publications or strong open-source impact

Company

Centific

Zero distance innovation for GenAI creators and industries. Expertly engineering platforms and curating multimodal, multilingual data, we empower the ‘Magnificent Seven’ and enterprise clients with safe, scalable AI deployment. We are a team of over 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers.

Founded in 2020

Redmond, Washington, USA

5001-10000 employees

https://www.centific.com

H1B Sponsorship

Centific has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)

Trends of Total Sponsorships

2025 (10)

2024 (22)

2023 (14)

Funding

Current Stage

Late Stage

Total Funding

$60M

Key Investors

Granite Asia

2025-06-24 · Series A · $60M

Leadership Team

Vasudevan Sundarababu

Chief Data and AI Officer

Recent News

Venture Capital Firms

Centific, the Market-Leading Enabler of Advanced AI, Closes Transformative $60M Series A Round

2025-06-26

PR Newswire

Centific to Redefine AI-Powered Video Intelligence Using NVIDIA AI

2025-01-08

Business Standard

IIITH's Raj Reddy Center for Technology and Society receives funding from Centific for AI Project on Automated Malnutrition Detection

2024-05-21

Company data provided by Crunchbase