| Medico: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion | Oct 14, 2024 | Hallucination | —Unverified | 0 |
| SkillAggregation: Reference-free LLM-Dependent Aggregation | Oct 14, 2024 | ChatbotHallucination | —Unverified | 0 |
| VideoAgent: Self-Improving Video Generation | Oct 14, 2024 | HallucinationVideo Generation | CodeCode Available | 2 |
| LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Oct 13, 2024 | HallucinationHallucination Evaluation | CodeCode Available | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | Oct 13, 2024 | Code GenerationHallucination | —Unverified | 0 |
| Honest AI: Fine-Tuning "Small" Language Models to Say "I Don't Know", and Reducing Hallucination in RAG | Oct 13, 2024 | HallucinationRAG | —Unverified | 0 |
| VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment | Oct 12, 2024 | DiversityHallucination | —Unverified | 0 |
| A Methodology for Evaluating RAG Systems: A Case Study On Configuration Dependency Validation | Oct 11, 2024 | HallucinationRAG | CodeCode Available | 0 |
| Measuring the Inconsistency of Large Language Models in Preferential Ranking | Oct 11, 2024 | DiagnosticHallucination | —Unverified | 0 |
| VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding | Oct 11, 2024 | HallucinationMoment Retrieval | CodeCode Available | 1 |