SOTAVerified

LLM Jailbreak

Papers

Showing 1–24 of 24 papers

| Title | Status | Hype |
| --- | --- | --- |
| Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues | Code | 2 |
| JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models | Code | 2 |
| PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks | Code | 2 |
| JailBreakV: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks | Code | 2 |
| CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models | Code | 1 |
| Automatic Prompt Optimization with "Gradient Descent" and Beam Search | Code | 1 |
| Cognitive Overload Attack: Prompt Injection for Long Context | Code | 1 |
| WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response | — | 0 |
| DiffusionAttacker: Diffusion-Driven Prompt Manipulation for LLM Jailbreak | — | 0 |
| Efficient Indirect LLM Jailbreak via Multimodal-LLM Jailbreak | — | 0 |
| Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks | — | 0 |
| Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack | — | 0 |
| Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles | — | 0 |
| HSF: Defending against Jailbreak Attacks with Hidden State Filtering | — | 0 |
| LLM Jailbreak Oracle | — | 0 |
| POEX: Understanding and Mitigating Policy Executable Jailbreak Attacks against Embodied AI | — | 0 |
| SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression | — | 0 |
| Self-Deception: Reverse Penetrating the Semantic Firewall of Large Language Models | — | 0 |
| SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner | — | 0 |
| Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation | Code | 0 |
| CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations | Code | 0 |
| Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization | Code | 0 |
| SMILES-Prompting: A Novel Approach to LLM Jailbreak Attacks in Chemical Synthesis | Code | 0 |
| SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage | Code | 0 |
No leaderboard results yet.