SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess the capability of LLMs to recognize hallucinations.

Papers

Showing 21–30 of 49 papers

| Title | Status | Hype |
| --- | --- | --- |
| UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation | Code | 1 |
| Exploring and Evaluating Hallucinations in LLM-Powered Code Generation | | 0 |
| GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | | 0 |
| Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | | 0 |
| TextSquare: Scaling up Text-Centric Visual Instruction Tuning | | 0 |
| TLDR: Token-Level Detective Reward Model for Large Vision Language Models | | 0 |
| Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs | | 0 |
| Lynx: An Open Source Hallucination Evaluation Model | | 0 |
| A Survey of Hallucination in Large Visual Language Models | | 0 |
| FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs | | 0 |
Page 3 of 5

No leaderboard results yet.