SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess their capability to recognize hallucinations.

Papers

Showing 41–49 of 49 papers

Title | Status | Hype
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | — | 0
TextSquare: Scaling up Text-Centric Visual Instruction Tuning | — | 0
Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation | — | 0
Exploring and Evaluating Hallucinations in LLM-Powered Code Generation | — | 0
Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | — | 0
Do Androids Know They're Only Dreaming of Electric Sheep? | — | 0
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks | — | 0
Page 5 of 5

No leaderboard results yet.