| Title | Date | Tags | Code |
|---|---|---|---|
| LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Oct 13, 2024 | Hallucination, Hallucination Evaluation | Code Available |
| VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment | Oct 12, 2024 | Diversity, Hallucination | Unverified |
| Measuring the Inconsistency of Large Language Models in Preferential Ranking | Oct 11, 2024 | Diagnostic, Hallucination | Unverified |
| A Methodology for Evaluating RAG Systems: A Case Study On Configuration Dependency Validation | Oct 11, 2024 | Hallucination, RAG | Code Available |
| LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Oct 10, 2024 | Hallucination | Unverified |
| PublicHearingBR: A Brazilian Portuguese Dataset of Public Hearing Transcripts for Summarization of Long Documents | Oct 10, 2024 | Articles, Document Summarization | Unverified |
| Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | Oct 10, 2024 | Hallucination, Knowledge Graphs | Unverified |
| Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Oct 9, 2024 | Hallucination, Multiple-choice | Code Available |
| From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models | Oct 9, 2024 | Attribute, Hallucination | Unverified |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | Oct 8, 2024 | GSM8K, Hallucination | Unverified |