SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess their capability to recognize hallucinations.

Papers

Showing 21–30 of 49 papers

| Title | Status | Hype |
| --- | --- | --- |
| Evaluation and Analysis of Hallucination in Large Vision-Language Models | Code | 1 |
| HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation | | 0 |
| MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations | Code | 0 |
| Mitigating Image Captioning Hallucinations in Vision-Language Models | | 0 |
| Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs | | 0 |
| Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best? | | 0 |
| Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs | | 0 |
| Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization | Code | 0 |
| TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0 |
| Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization | | 0 |
Page 3 of 5

No leaderboard results yet.