
Hallucination Evaluation

Evaluate the ability of LLMs to generate hallucination-free text, or assess their ability to recognize hallucinations.

Papers


Showing 1–10 of 49 papers

Title	Date	Tasks	Status	Hype
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models	Jun 16, 2024	Hallucination, Hallucination Evaluation	Code Available	3
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space	Feb 27, 2024	Contrastive Learning, Hallucination	Code Available	2
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models	Oct 23, 2023	Diagnostic, Hallucination	Code Available	2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models	Aug 17, 2023	Decision Making, Hallucination	Code Available	2
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models	May 19, 2023	Hallucination, Hallucination Evaluation	Code Available	2
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality	Jun 24, 2025	Hallucination, Hallucination Evaluation	Code Available	1
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards	May 7, 2025	Benchmarking, Hallucination	Code Available	1
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation	Mar 25, 2025	Hallucination, Hallucination Evaluation	Code Available	1
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering	Sep 19, 2024	Hallucination, Hallucination Evaluation	Code Available	1
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models	Aug 18, 2024	Attribute, Hallucination	Code Available	1


No leaderboard results yet.