SOTAVerified

Hallucination Evaluation

Evaluates the ability of LLMs to generate non-hallucinated text, or assesses their capability to recognize hallucinations.

Papers

Showing 21–30 of 49 papers

Title | Status | Hype
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Code | 1
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models | Code | 1
Enhancing LLM's Cognition via Structurization | Code | 1
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | | 0
Lynx: An Open Source Hallucination Evaluation Model | | 0
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | | 0
TextSquare: Scaling up Text-Centric Visual Instruction Tuning | | 0

No leaderboard results yet.