| Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion | Dec 12, 2024 | HallucinationKnowledge Graph Completion | CodeCode Available | 1 |
| Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations | Oct 6, 2023 | HallucinationLanguage Modeling | CodeCode Available | 1 |
| Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers | Nov 13, 2023 | Hallucinationknowledge editing | CodeCode Available | 1 |
| Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources | May 22, 2023 | HallucinationLanguage Modelling | CodeCode Available | 1 |
| A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs | May 13, 2025 | HallucinationUncertainty Quantification | CodeCode Available | 1 |
| FineSurE: Fine-grained Summarization Evaluation using LLMs | Jul 1, 2024 | BenchmarkingHallucination | CodeCode Available | 1 |
| Deficiency-Aware Masked Transformer for Video Inpainting | Jul 17, 2023 | HallucinationImage Inpainting | CodeCode Available | 1 |
| Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Feb 10, 2024 | DiagnosticHallucination | CodeCode Available | 1 |
| SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency | Nov 3, 2023 | HallucinationQuestion Answering | CodeCode Available | 1 |
| Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Jun 5, 2025 | Automated Theorem ProvingHallucination | CodeCode Available | 1 |
| HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data | Nov 22, 2023 | Attributecounterfactual | CodeCode Available | 1 |
| Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation | Sep 29, 2023 | 3D Object DetectionAttribute | CodeCode Available | 1 |
| FaithDial: A Faithful Benchmark for Information-Seeking Dialogue | Apr 22, 2022 | Dialogue GenerationHallucination | CodeCode Available | 1 |
| FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs | Oct 17, 2024 | DiversityHallucination | CodeCode Available | 1 |
| Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers | Oct 16, 2023 | 16kHallucination | CodeCode Available | 1 |
| Analyzing and Mitigating Object Hallucination in Large Vision-Language Models | Oct 1, 2023 | HallucinationHallucination Evaluation | CodeCode Available | 1 |
| Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Apr 15, 2024 | BenchmarkingBias Detection | CodeCode Available | 1 |
| Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback | Apr 22, 2024 | AttributeHallucination | CodeCode Available | 1 |
| Detecting and Preventing Hallucinations in Large Vision Language Models | Aug 11, 2023 | 16kHallucination | CodeCode Available | 1 |
| Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | May 7, 2025 | BenchmarkingHallucination | CodeCode Available | 1 |
| FactAlign: Long-form Factuality Alignment of Large Language Models | Oct 2, 2024 | FormHallucination | CodeCode Available | 1 |
| FAIR GPT: A virtual consultant for research data management in ChatGPT | Sep 20, 2024 | FairnessHallucination | CodeCode Available | 1 |
| Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding | Jun 18, 2024 | Hallucination | CodeCode Available | 1 |
| Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation | Apr 6, 2022 | Domain GeneralizationHallucination | CodeCode Available | 1 |
| Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Apr 17, 2024 | HallucinationMultimodal Reasoning | CodeCode Available | 1 |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | BenchmarkingEvidence Selection | CodeCode Available | 1 |
| Extract Free Dense Misalignment from CLIP | Dec 24, 2024 | HallucinationImage Generation | CodeCode Available | 1 |
| AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | Oct 4, 2023 | HallucinationText Generation | CodeCode Available | 1 |
| Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization | Nov 28, 2023 | HallucinationMME | CodeCode Available | 1 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Jun 17, 2024 | HallucinationMixture-of-Experts | CodeCode Available | 1 |
| Face Hallucination via Split-Attention in Split-Attention Network | Oct 22, 2020 | Face DetectionFace Hallucination | CodeCode Available | 1 |
| Evaluation and Analysis of Hallucination in Large Vision-Language Models | Aug 29, 2023 | HallucinationHallucination Evaluation | CodeCode Available | 1 |
| The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | Mar 14, 2024 | Hallucinationimage-classification | CodeCode Available | 1 |
| DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models | Mar 1, 2024 | HallucinationHallucination Evaluation | CodeCode Available | 1 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Jun 24, 2024 | Hallucination | CodeCode Available | 1 |
| Theory of Mind for Multi-Agent Collaboration via Large Language Models | Oct 16, 2023 | HallucinationMulti-agent Reinforcement Learning | CodeCode Available | 1 |
| EventHallusion: Diagnosing Event Hallucinations in Video LLMs | Sep 25, 2024 | HallucinationInstruction Following | CodeCode Available | 1 |
| DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models | Jun 13, 2025 | AllHallucination | CodeCode Available | 1 |
| Doc2Query--: When Less is More | Jan 9, 2023 | HallucinationRetrieval | CodeCode Available | 1 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Oct 11, 2021 | BenchmarkingFace Hallucination | CodeCode Available | 1 |
| Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Sep 19, 2024 | HallucinationHallucination Evaluation | CodeCode Available | 1 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Jun 9, 2024 | Common Sense ReasoningDenoising | CodeCode Available | 1 |
| Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Mar 25, 2025 | HallucinationHallucination Evaluation | CodeCode Available | 1 |
| Distinguishing Ignorance from Error in LLM Hallucinations | Oct 29, 2024 | HallucinationQuestion Answering | CodeCode Available | 1 |
| Federated Recommendation via Hybrid Retrieval Augmented Generation | Mar 7, 2024 | HallucinationPrivacy Preserving | CodeCode Available | 1 |
| Hallucinated Neural Radiance Fields in the Wild | Nov 30, 2021 | HallucinationNeRF | CodeCode Available | 1 |
| Label Hallucination for Few-Shot Classification | Dec 6, 2021 | ClassificationFew-Shot Learning | CodeCode Available | 1 |
| PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine | Aug 23, 2023 | Ensemble LearningHallucination | CodeCode Available | 1 |
| Trustworthiness in Retrieval-Augmented Generation Systems: A Survey | Sep 16, 2024 | FairnessHallucination | CodeCode Available | 1 |
| Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation | Apr 18, 2024 | HallucinationHallucination Evaluation | —Unverified | 0 |