SOTAVerified

Hallucination Papers

Showing 901–950 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models | — | 0 |
| Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models | Code | 1 |
| Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Code | 0 |
| Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs | Code | 2 |
| Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework | Code | 2 |
| Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | — | 0 |
| From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment | — | 0 |
| HIGHT: Hierarchical Graph Tokenization for Molecule-Language Alignment | — | 0 |
| Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination | — | 0 |
| Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases | Code | 2 |
| Knowledge Graph-Enhanced Large Language Models via Path Selection | Code | 1 |
| StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Code | 0 |
| Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors | Code | 0 |
| Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding | Code | 1 |
| RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation | — | 0 |
| What Matters in Memorizing and Recalling Facts? Multifaceted Benchmarks for Knowledge Probing in Language Models | — | 0 |
| On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation | Code | 0 |
| Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning? | — | 0 |
| Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models | — | 0 |
| InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States | — | 0 |
| Self-training Large Language Models through Knowledge Detection | Code | 0 |
| Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector | Code | 1 |
| Mitigating Large Language Model Hallucination with Faithful Finetuning | — | 0 |
| Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs | Code | 0 |
| Hallucination Mitigation Prompts Long-term Video Understanding | Code | 0 |
| CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation | — | 0 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1 |
| Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models | Code | 2 |
| mDPO: Conditional Preference Optimization for Multimodal Large Language Models | Code | 2 |
| Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals | — | 0 |
| Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Code | 0 |
| AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3 |
| Detecting and Evaluating Medical Hallucinations in Large Vision Language Models | — | 0 |
| MMRel: A Relation Understanding Benchmark in the MLLM Era | Code | 1 |
| Understanding Hallucinations in Diffusion Models through Mode Interpolation | Code | 2 |
| DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0 |
| We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs | Code | 1 |
| Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models | Code | 2 |
| Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis | — | 0 |
| REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy | Code | 1 |
| Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions | Code | 2 |
| A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation | Code | 0 |
| HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Code | 0 |
| On the Hallucination in Simultaneous Machine Translation | Code | 0 |
| Progressive Query Expansion for Retrieval Over Cost-constrained Data Sources | — | 0 |
| Estimating the Hallucination Rate of Generative AI | — | 0 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Code | 1 |
| Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation | — | 0 |
| CRAG -- Comprehensive RAG Benchmark | Code | 3 |
| An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Code | 1 |
Page 19 of 37
