| JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models | Jun 26, 2024 | LLM Jailbreak, Survey | Code Available | 2 | 5 |
| Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues | Oct 14, 2024 | LLM Jailbreak, Safety Alignment | Code Available | 2 | 5 |
| JailBreakV: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks | Apr 3, 2024 | LLM Jailbreak | Code Available | 2 | 5 |
| PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks | May 20, 2025 | LLM Jailbreak, Safety Alignment | Code Available | 2 | 5 |
| Cognitive Overload Attack: Prompt Injection for Long Context | Oct 15, 2024 | In-Context Learning, LLM Jailbreak | Code Available | 1 | 5 |
| Automatic Prompt Optimization with "Gradient Descent" and Beam Search | May 4, 2023 | LLM Jailbreak | Code Available | 1 | 5 |
| CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models | Jan 2, 2025 | Benchmarking, Computer Security | Code Available | 1 | 5 |
| CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations | Jul 8, 2025 | Generative Adversarial Network, Large Language Model | Code Available | 0 | 5 |
| Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation | Jan 28, 2025 | LLM Jailbreak | Code Available | 0 | 5 |
| Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization | May 15, 2024 | LLM Jailbreak | Code Available | 0 | 5 |