SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess their capability to recognize hallucinations.

Papers

Showing 1–10 of 49 papers

| Title | Status | Hype |
| --- | --- | --- |
| AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Code | 2 |
| MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2 |
| TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | Code | 2 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | Code | 2 |
| Alleviating Hallucinations of Large Language Models through Induced Hallucinations | Code | 1 |
| Evaluation and Analysis of Hallucination in Large Vision-Language Models | Code | 1 |
| AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation | Code | 1 |
| Enhancing LLM's Cognition via Structurization | Code | 1 |
| Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | Code | 1 |

No leaderboard results yet.