| Phare: A Safety Probe for Large Language Models | May 16, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs | May 13, 2025 | Hallucination, Uncertainty Quantification | Code Available | 1 |
| Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models | May 11, 2025 | Descriptive, Diagnostic | Code Available | 1 |
| Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | May 7, 2025 | Benchmarking, Hallucination | Code Available | 1 |
| Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering | May 5, 2025 | Hallucination, Question Answering | Code Available | 1 |
| VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding | May 2, 2025 | Anomaly Detection, Common Sense Reasoning | Code Available | 1 |
| Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception | Apr 29, 2025 | counterfactual, Hallucination | Code Available | 1 |
| Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations | Apr 18, 2025 | Hallucination | Code Available | 1 |
| VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models | Apr 17, 2025 | Hallucination, Video Understanding | Code Available | 1 |
| EmbodiedAgent: A Scalable Hierarchical Approach to Overcome Practical Challenge in Multi-Robot Control | Apr 14, 2025 | Hallucination | Code Available | 1 |