| Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Jan 14, 2025 | Embodied Question Answering, Hallucination | Code Available | 4 |
| GPT as a Monte Carlo Language Tree: A Probabilistic Perspective | Jan 13, 2025 | Hallucination | Unverified | 0 |
| Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering | Jan 11, 2025 | Hallucination, Question Answering | Code Available | 0 |
| VASparse: Towards Efficient Visual Hallucination Mitigation for Large Vision-Language Model via Visual-Aware Sparsification | Jan 11, 2025 | Hallucination, Language Modeling | Code Available | 1 |
| MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare | Jan 11, 2025 | Diagnostic, Entity Linking | Unverified | 0 |
| Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea | Jan 10, 2025 | Hallucination, Misinformation | Unverified | 0 |
| ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark | Jan 9, 2025 | Fairness, Hallucination | Code Available | 1 |
| Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments | Jan 9, 2025 | Conformal Prediction, Hallucination | Unverified | 0 |
| Feedback-Driven Vision-Language Alignment with Minimal Human Supervision | Jan 8, 2025 | Hallucination, Question Answering | Unverified | 0 |
| RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance | Jan 7, 2025 | Hallucination, RAG | Unverified | 0 |