SOTAVerified|Agents Browse Leaderboard About Blog

Red Teaming

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 121–130 of 251 papers

Title	Date	Tasks	Status	Hype
A Safe Harbor for AI Evaluation and Red Teaming	Mar 7, 2024	Red Teaming	—Unverified	0
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI	Sep 23, 2024	Red Teaming	—Unverified	0
AttackGNN: Red-Teaming GNNs in Hardware Security Using Reinforcement Learning	Feb 21, 2024	Graph Neural NetworkRed Teaming	—Unverified	0
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code	Mar 30, 2024	Continual PretrainingLanguage Modelling	—Unverified	0
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester	Oct 2, 2024	Red Teaming	—Unverified	0
Automating Privilege Escalation with Deep Reinforcement Learning	Oct 4, 2021	BIG-bench Machine LearningDeep Reinforcement Learning	—Unverified	0
AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration	Mar 20, 2025	Red Teaming	—Unverified	0
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models	Jan 3, 2025	Red Teaming	—Unverified	0
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming	Feb 22, 2025	DiversityIn-Context Learning	—Unverified	0
Breaking the Global North Stereotype: A Global South-centric Benchmark Dataset for Auditing and Mitigating Biases in Facial Recognition Systems	Jul 22, 2024	Contrastive LearningGender Prediction	—Unverified	0

Show:10 25 50

← PrevPage 13 of 26Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	SUDO	Attack Success Rate	41	—	Unverified