| AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Jun 16, 2024 | Hallucination, Hallucination Evaluation | Code Available | 3 | 5 |
| KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Mar 5, 2024 | Hallucination, Self-Learning | Code Available | 3 | 5 |
| Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Mar 19, 2024 | Hallucination | Code Available | 3 | 5 |
| MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | Oct 16, 2024 | Diagnostic, Hallucination | Code Available | 3 | 5 |
| PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | Feb 12, 2024 | Answer Generation, Hallucination | Code Available | 3 | 5 |
| Automated Hypothesis Validation with Agentic Sequential Falsifications | Feb 14, 2025 | Decision Making, Hallucination | Code Available | 3 | 5 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | May 19, 2023 | Hallucination, Hallucination Evaluation | Code Available | 2 | 5 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Oct 23, 2023 | Diagnostic, Hallucination | Code Available | 2 | 5 |
| Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models | Aug 4, 2024 | Hallucination | Code Available | 2 | 5 |
| HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | May 19, 2023 | Hallucination, Machine Translation | Code Available | 2 | 5 |