| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | May 8, 2024 | Attribute, Data Augmentation | Code Available |
| CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification | Apr 30, 2024 | Code Generation, Hallucination | Code Available |
| VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models | Apr 22, 2024 | Hallucination, Informativeness | Code Available |
| Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback | Apr 22, 2024 | Attribute, Hallucination | Code Available |
| LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation | Apr 22, 2024 | Hallucination, RAG | Code Available |
| Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Apr 17, 2024 | Hallucination, Multimodal Reasoning | Code Available |
| MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory | Apr 17, 2024 | Hallucination, Language Modeling | Code Available |
| Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs | Apr 15, 2024 | Hallucination, Language Modeling | Code Available |
| Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Apr 15, 2024 | Benchmarking, Bias Detection | Code Available |
| Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration | Apr 15, 2024 | Hallucination | Code Available |