SOTAVerified

Hallucination Evaluation

Evaluate the ability of LLMs to generate non-hallucinated text, or assess their capability to recognize hallucinations.

Papers

Showing 31–40 of 49 papers

Title | Status | Hype
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0
DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark | Code | 0
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Code | 0
A Survey of Hallucination in Large Visual Language Models | — | 0
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Code | 0
TLDR: Token-Level Detective Reward Model for Large Vision Language Models | — | 0
Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization | Code | 0
FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs | — | 0
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework | — | 0
Lynx: An Open Source Hallucination Evaluation Model | — | 0

No leaderboard results yet.