| GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks | Mar 23, 2025 | Benchmarking, Hallucination | Code Available | 1 | 5 |
| EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models | Feb 15, 2024 | Hallucination, Object Hallucination | Code Available | 1 | 5 |
| Automated Multi-level Preference for MLLMs | May 18, 2024 | Dataset Generation, Hallucination | Code Available | 1 | 5 |
| Generating Natural Language Proofs with Verifier-Guided Search | May 25, 2022 | Hallucination, valid | Code Available | 1 | 5 |
| Know Or Not: a library for evaluating out-of-knowledge base robustness | May 19, 2025 | Hallucination, RAG | Code Available | 1 | 5 |
| LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction | Aug 22, 2023 | Hallucination, Motion Compensation | Code Available | 1 | 5 |
| Knowledge Graph-based Retrieval-Augmented Generation for Schema Matching | Jan 15, 2025 | Hallucination, Knowledge Graphs | Code Available | 1 | 5 |
| Knowledge Graph-Enhanced Large Language Models via Path Selection | Jun 19, 2024 | Hallucination, Knowledge Graphs | Code Available | 1 | 5 |
| ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark | Jan 9, 2025 | Fairness, Hallucination | Code Available | 1 | 5 |
| EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Oct 11, 2021 | Benchmarking, Face Hallucination | Code Available | 1 | 5 |
| Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation | Apr 4, 2025 | Clustering, Hallucination | Code Available | 1 | 5 |
| All in an Aggregated Image for In-Image Learning | Feb 28, 2024 | All, Hallucination | Code Available | 1 | 5 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Jun 9, 2024 | Common Sense Reasoning, Denoising | Code Available | 1 | 5 |
| Mitigating Multilingual Hallucination in Large Vision-Language Models | Aug 1, 2024 | Hallucination | Code Available | 1 | 5 |
| Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning | Jan 31, 2023 | Hallucination, Semantic Parsing | Code Available | 1 | 5 |
| Distinguishing Ignorance from Error in LLM Hallucinations | Oct 29, 2024 | Hallucination, Question Answering | Code Available | 1 | 5 |
| Alleviating Hallucinations of Large Language Models through Induced Hallucinations | Dec 25, 2023 | Hallucination, Hallucination Evaluation | Code Available | 1 | 5 |
| Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models | Jun 5, 2025 | Diagnostic, Hallucination | Code Available | 1 | 5 |
| Doc2Query--: When Less is More | Jan 9, 2023 | Hallucination, Retrieval | Code Available | 1 | 5 |
| KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection | Oct 13, 2023 | Abstractive Text Summarization, Hallucination | Code Available | 1 | 5 |
| DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models | Jun 13, 2025 | All, Hallucination | Code Available | 1 | 5 |
| JDocQA: Japanese Document Question Answering Dataset for Generative Language Models | Mar 28, 2024 | Hallucination, Question Answering | Code Available | 1 | 5 |
| "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters | Oct 13, 2023 | Benchmarking, Fairness | Code Available | 1 | 5 |
| Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy | Feb 25, 2024 | Hallucination, Sentence | Code Available | 1 | 5 |
| Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering | May 5, 2025 | Hallucination, Question Answering | Code Available | 1 | 5 |