SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess the capability of LLMs to recognize hallucinations.

Papers

Showing 21–30 of 49 papers

| Title | Status | Hype |
| --- | --- | --- |
| UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation | Code | 1 |
| Exploring and Evaluating Hallucinations in LLM-Powered Code Generation | | 0 |
| GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | | 0 |
| Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | | 0 |
| TextSquare: Scaling up Text-Centric Visual Instruction Tuning | | 0 |
| TLDR: Token-Level Detective Reward Model for Large Vision Language Models | | 0 |
| Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs | | 0 |
| Lynx: An Open Source Hallucination Evaluation Model | | 0 |
| A Survey of Hallucination in Large Visual Language Models | | 0 |
| FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs | | 0 |
Page 3 of 5

No leaderboard results yet.