SOTAVerified

Hallucination

Papers

Showing 251-275 of 1816 papers

Title | Status | Hype
RARE: Retrieval-Augmented Reasoning Modeling | Code | 2
An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering | - | 0
Learning to Instruct for Visual Instruction Tuning | - | 0
Real-Time Evaluation Models for RAG: Who Detects Hallucinations Best? | - | 0
Alleviating LLM-based Generative Retrieval Hallucination in Alipay Search | - | 0
Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack | - | 0
Vision-Amplified Semantic Entropy for Hallucination Detection in Medical Visual Question Answering | - | 0
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs | - | 0
Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy | - | 0
GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization | Code | 0
TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes | - | 0
KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models | - | 0
CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning | Code | 1
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Code | 1
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Code | 1
HausaNLP at SemEval-2025 Task 3: Towards a Fine-Grained Model-Aware Hallucination Detection | - | 0
ShED-HD: A Shannon Entropy Distribution Framework for Lightweight Hallucination Detection on Edge Devices | - | 0
GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks | Code | 1
good4cir: Generating Detailed Synthetic Captions for Composed Image Retrieval | - | 0
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs | - | 0
Judge Anything: MLLM as a Judge Across Any Modality | - | 0
ProDehaze: Prompting Diffusion Models Toward Faithful Image Dehazing | Code | 1
ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph | - | 0
MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations | - | 0
DNR Bench: Benchmarking Over-Reasoning in Reasoning LLMs | - | 0
Page 11 of 73

No leaderboard results yet.