SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess their capability to recognize hallucinations.

Papers

Showing 41–49 of 49 papers

Title | Status | Hype
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | — | 0
TextSquare: Scaling up Text-Centric Visual Instruction Tuning | — | 0
Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation | — | 0
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation | — | 0
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | — | 0
Do Androids Know They're Only Dreaming of Electric Sheep? | — | 0
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks | — | 0
Page 5 of 5

No leaderboard results yet.