| Judge Anything: MLLM as a Judge Across Any Modality | Mar 21, 2025 | Hallucination | Unverified | 0 |
| FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs | Mar 21, 2025 | Hallucination, Knowledge Graphs | Unverified | 0 |
| MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations | Mar 20, 2025 | Hallucination, Video Understanding | Unverified | 0 |
| DNR Bench: Benchmarking Over-Reasoning in Reasoning LLMs | Mar 20, 2025 | Benchmarking, Hallucination | Unverified | 0 |
| ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph | Mar 20, 2025 | Benchmarking, Hallucination | Unverified | 0 |
| Towards Lighter and Robust Evaluation for Retrieval Augmented Generation | Mar 20, 2025 | Hallucination, RAG | Code Available | 0 |
| Poly-FEVER: A Multilingual Fact Verification Benchmark for Hallucination Detection in Large Language Models | Mar 19, 2025 | Fact Checking, Fact Verification | Unverified | 0 |
| R^2: A LLM Based Novel-to-Screenplay Generation Framework with Causal Plot Graphs | Mar 19, 2025 | graph construction, Hallucination | Unverified | 0 |
| MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models | Mar 19, 2025 | Adversarial Robustness, Autonomous Driving | Unverified | 0 |
| Enhancing LLM Generation with Knowledge Hypergraph for Evidence-Based Medicine | Mar 18, 2025 | Hallucination, RAG | Unverified | 0 |