SOTAVerified

Red Teaming

Papers

Showing 81-90 of 251 papers

Title | Status | Hype
POEX: Understanding and Mitigating Policy Executable Jailbreak Attacks against Embodied AI | - | 0
OpenAI o1 System Card | - | 0
AI red-teaming is a sociotechnical challenge: on values, labor, and harms | - | 0
Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM | Code | 0
PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage | Code | 1
Embodied Red Teaming for Auditing Robotic Foundation Models | - | 0
In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models | - | 0
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs | Code | 1
LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | - | 0
Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SUDO | Attack Success Rate | 41 | - | Unverified