| AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Jun 16, 2024 | Hallucination, Hallucination Evaluation | Code Available | 3 |
| Graph Retrieval-Augmented Generation: A Survey | Aug 15, 2024 | Hallucination, RAG | Code Available | 3 |
| PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | Feb 2, 2024 | Action Generation, Decision Making | Code Available | 3 |
| Automated Hypothesis Validation with Agentic Sequential Falsifications | Feb 14, 2025 | Decision Making, Hallucination | Code Available | 3 |
| RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | Aug 2, 2024 | Benchmarking, Dataset Generation | Code Available | 3 |
| PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models | Mar 8, 2020 | Face Hallucination, Hallucination | Code Available | 3 |
| HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems | Nov 5, 2024 | Hallucination, RAG | Code Available | 3 |
| When Large Language Models Meet Vector Databases: A Survey | Jan 30, 2024 | Hallucination, Information Retrieval | Code Available | 3 |
| MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models | Oct 16, 2024 | Diagnostic, Hallucination | Code Available | 3 |
| CRAG -- Comprehensive RAG Benchmark | Jun 7, 2024 | Hallucination, Language Modelling | Code Available | 3 |
| Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent | Nov 5, 2024 | Benchmarking, Hallucination | Code Available | 3 |
| Evaluating Hallucinations in Chinese Large Language Models | Oct 5, 2023 | Hallucination, Question Answering | Code Available | 3 |
| Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Mar 19, 2024 | Hallucination | Code Available | 3 |
| LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Aug 28, 2024 | Computational Efficiency, Hallucination | Code Available | 3 |
| RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing | Apr 30, 2024 | Computational Efficiency, Hallucination | Code Available | 3 |
| Learning Dynamics of LLM Finetuning | Jul 15, 2024 | Hallucination | Code Available | 3 |
| InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment | Feb 13, 2024 | Hallucination | Code Available | 2 |
| Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions | Dec 20, 2022 | Hallucination, Question Answering | Code Available | 2 |
| CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs | Jan 28, 2025 | Hallucination | Code Available | 2 |
| Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models | Aug 4, 2024 | Hallucination | Code Available | 2 |
| In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation | Mar 3, 2024 | Hallucination, TruthfulQA | Code Available | 2 |
| HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | May 19, 2023 | Hallucination, Hallucination Evaluation | Code Available | 2 |
| A Diffusion-Based Generative Equalizer for Music Restoration | Mar 27, 2024 | Bandwidth Extension, Hallucination | Code Available | 2 |
| HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | May 19, 2023 | Hallucination, Machine Translation | Code Available | 2 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Oct 23, 2023 | Diagnostic, Hallucination | Code Available | 2 |