SOTAVerified

Hallucination Evaluation

Evaluates the ability of LLMs to generate non-hallucinated text, or assesses their capability to recognize hallucinations.

Papers

Showing 21–30 of 49 papers

Title | Status | Hype
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Code | 1
Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models | Code | 1
Enhancing LLM's Cognition via Structurization | Code | 1
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | | 0
Lynx: An Open Source Hallucination Evaluation Model | | 0
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0
CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | | 0
TextSquare: Scaling up Text-Centric Visual Instruction Tuning | | 0

No leaderboard results yet.