| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | Benchmarking, Evidence Selection | Code Available | 1 |
| Extract Free Dense Misalignment from CLIP | Dec 24, 2024 | Hallucination, Image Generation | Code Available | 1 |
| AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | Oct 4, 2023 | Hallucination, Text Generation | Code Available | 1 |
| Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization | Nov 28, 2023 | Hallucination, MME | Code Available | 1 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Jun 17, 2024 | Hallucination, Mixture-of-Experts | Code Available | 1 |
| Face Hallucination via Split-Attention in Split-Attention Network | Oct 22, 2020 | Face Detection, Face Hallucination | Code Available | 1 |
| Evaluation and Analysis of Hallucination in Large Vision-Language Models | Aug 29, 2023 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | Mar 14, 2024 | Hallucination, image-classification | Code Available | 1 |
| DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models | Mar 1, 2024 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Jun 24, 2024 | Hallucination | Code Available | 1 |
| Theory of Mind for Multi-Agent Collaboration via Large Language Models | Oct 16, 2023 | Hallucination, Multi-agent Reinforcement Learning | Code Available | 1 |
| EventHallusion: Diagnosing Event Hallucinations in Video LLMs | Sep 25, 2024 | Hallucination, Instruction Following | Code Available | 1 |
| DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models | Jun 13, 2025 | All, Hallucination | Code Available | 1 |
| Doc2Query--: When Less is More | Jan 9, 2023 | Hallucination, Retrieval | Code Available | 1 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Oct 11, 2021 | Benchmarking, Face Hallucination | Code Available | 1 |
| Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Sep 19, 2024 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Jun 9, 2024 | Common Sense Reasoning, Denoising | Code Available | 1 |
| Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Mar 25, 2025 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| Distinguishing Ignorance from Error in LLM Hallucinations | Oct 29, 2024 | Hallucination, Question Answering | Code Available | 1 |
| Federated Recommendation via Hybrid Retrieval Augmented Generation | Mar 7, 2024 | Hallucination, Privacy Preserving | Code Available | 1 |
| Hallucinated Neural Radiance Fields in the Wild | Nov 30, 2021 | Hallucination, NeRF | Code Available | 1 |
| Label Hallucination for Few-Shot Classification | Dec 6, 2021 | Classification, Few-Shot Learning | Code Available | 1 |
| PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine | Aug 23, 2023 | Ensemble Learning, Hallucination | Code Available | 1 |
| Trustworthiness in Retrieval-Augmented Generation Systems: A Survey | Sep 16, 2024 | Fairness, Hallucination | Code Available | 1 |
| Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation | Apr 18, 2024 | Hallucination, Hallucination Evaluation | Unverified | 0 |