| MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Oct 5, 2023 | Benchmarking, Decision Making | Code Available | 2 |
| Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement | Mar 31, 2025 | Hallucination, RAG | Code Available | 2 |
| Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions | Dec 20, 2022 | Hallucination, Question Answering | Code Available | 2 |
| Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Feb 10, 2024 | Diagnostic, Hallucination | Code Available | 1 |
| BachGAN: High-Resolution Image Synthesis from Salient Object Layout | Mar 26, 2020 | Generative Adversarial Network, Hallucination | Code Available | 1 |
| FlySearch: Exploring how vision-language models explore | Jun 3, 2025 | Hallucination, Task Planning | Code Available | 1 |
| Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception | Apr 29, 2025 | counterfactual, Hallucination | Code Available | 1 |
| Adversarial Feature Hallucination Networks for Few-Shot Learning | Mar 30, 2020 | Data Augmentation, Diversity | Code Available | 1 |
| Balanced Classification: A Unified Framework for Long-Tailed Object Detection | Aug 4, 2023 | Hallucination, Long-tailed Object Detection | Code Available | 1 |
| Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation | May 16, 2025 | Hallucination, RAG | Code Available | 1 |