| Towards Robust Evaluation of STEM Education: Leveraging MLLMs in Project-Based Learning | May 16, 2025 | Hallucination, Information Retrieval | Unverified | 0 |
| Diverging Towards Hallucination: Detection of Failures in Vision-Language Models via Multi-token Aggregation | May 16, 2025 | Diagnostic, Hallucination | Unverified | 0 |
| EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models | May 16, 2025 | Hallucination | Code Available | 0 |
| Phare: A Safety Probe for Large Language Models | May 16, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation | May 16, 2025 | Hallucination, RAG | Code Available | 1 |
| DO-RAG: A Domain-Specific QA Framework Using Knowledge Graph-Enhanced Retrieval-Augmented Generation | May 15, 2025 | graph construction, Hallucination | Code Available | 0 |
| AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges | May 15, 2025 | AI Agent, Data Summarization | Unverified | 0 |
| Beyond the Black Box: Interpretability of LLMs in Finance | May 14, 2025 | Fairness, Hallucination | Unverified | 0 |
| The Impact of Large Language Models on Task Automation in Manufacturing Services | May 14, 2025 | Hallucination, Question Answering | Unverified | 0 |
| A Multimodal Multi-Agent Framework for Radiology Report Generation | May 14, 2025 | Diagnostic, Hallucination | Unverified | 0 |