| ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models | Jun 28, 2024 | Diagnostic, Hallucination | Code Available | 1 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Jun 24, 2024 | Hallucination | Code Available | 1 |
| Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models | Jun 24, 2024 | Common Sense Reasoning, Hallucination | Code Available | 1 |
| Knowledge Graph-Enhanced Large Language Models via Path Selection | Jun 19, 2024 | Hallucination, Knowledge Graphs | Code Available | 1 |
| Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding | Jun 18, 2024 | Hallucination | Code Available | 1 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Jun 17, 2024 | Hallucination, Mixture-of-Experts | Code Available | 1 |
| Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector | Jun 17, 2024 | 2k, Hallucination | Code Available | 1 |
| MMRel: A Relation Understanding Benchmark in the MLLM Era | Jun 13, 2024 | Diversity, Hallucination | Code Available | 1 |
| We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs | Jun 12, 2024 | Code Generation, Hallucination | Code Available | 1 |
| REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy | Jun 11, 2024 | Diversity, Hallucination | Code Available | 1 |