SOTAVerified: Hallucination Papers

Showing 926–950 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation | | 0 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1 |
| Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models | Code | 2 |
| mDPO: Conditional Preference Optimization for Multimodal Large Language Models | Code | 2 |
| Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals | | 0 |
| Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Code | 0 |
| AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3 |
| Detecting and Evaluating Medical Hallucinations in Large Vision Language Models | | 0 |
| MMRel: A Relation Understanding Benchmark in the MLLM Era | Code | 1 |
| Understanding Hallucinations in Diffusion Models through Mode Interpolation | Code | 2 |
| DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0 |
| We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs | Code | 1 |
| Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models | Code | 2 |
| Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis | | 0 |
| REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy | Code | 1 |
| Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions | Code | 2 |
| A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation | Code | 0 |
| HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0 |
| On the Hallucination in Simultaneous Machine Translation | Code | 0 |
| Progressive Query Expansion for Retrieval Over Cost-constrained Data Sources | | 0 |
| Estimating the Hallucination Rate of Generative AI | | 0 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Code | 1 |
| Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation | | 0 |
| CRAG -- Comprehensive RAG Benchmark | Code | 3 |
| An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Code | 1 |
Page 38 of 73
