
Hallucination Evaluation

Evaluate the ability of LLMs to generate text free of hallucinations, or assess their capability to recognize hallucinations in given text.
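The recognition setting can be framed as a simple judging loop: show the model a context and a claim, ask whether the claim is supported, and score its answers against gold labels. The sketch below illustrates this under stated assumptions; `query_llm` is a placeholder for any text-completion client, and the example fields (`context`, `claim`, `label`) are illustrative rather than taken from any specific benchmark listed here.

```python
# Minimal sketch of a hallucination-recognition evaluation.
# Assumes the caller supplies query_llm(prompt) -> str; the toy data below is illustrative.
from typing import Callable, Dict, Iterable


def evaluate_hallucination_recognition(
    examples: Iterable[Dict[str, str]],
    query_llm: Callable[[str], str],
) -> float:
    """Accuracy of an LLM at labeling claims as supported vs. hallucinated."""
    correct = 0
    total = 0
    for ex in examples:
        prompt = (
            "Context:\n" + ex["context"] + "\n\n"
            "Claim:\n" + ex["claim"] + "\n\n"
            "Is the claim fully supported by the context? Answer YES or NO."
        )
        answer = query_llm(prompt).strip().upper()
        predicted = "supported" if answer.startswith("YES") else "hallucinated"
        correct += int(predicted == ex["label"])
        total += 1
    return correct / max(total, 1)


if __name__ == "__main__":
    # Toy run with a mock model that always answers YES; swap in a real client in practice.
    toy = [
        {"context": "The Eiffel Tower is in Paris.",
         "claim": "The Eiffel Tower is in Paris.", "label": "supported"},
        {"context": "The Eiffel Tower is in Paris.",
         "claim": "The Eiffel Tower is in Rome.", "label": "hallucinated"},
    ]
    print(evaluate_hallucination_recognition(toy, lambda p: "YES"))
```

Generation-side benchmarks invert this setup: the model under test produces the text, and a separate checker (human, rule-based, or another LLM) labels each claim as supported or hallucinated.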

Papers

Showing 41–49 of 49 papers

Title | Status | Hype
Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization | Code | 0
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Code | 0
Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization | Code | 0
MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations | Code | 0
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0
DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark | Code | 0
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Code | 0
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0

No leaderboard results yet.