| QuantMCP: Grounding Large Language Models in Verifiable Financial Reality | Jun 7, 2025 | Decision Making, Financial Analysis | Unverified | 0 |
| Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models | Jun 5, 2025 | Diagnostic, Hallucination | Code Available | 1 |
| Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Jun 5, 2025 | Automated Theorem Proving, Hallucination | Code Available | 1 |
| CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection | Jun 5, 2025 | Hallucination, Natural Language Inference | Unverified | 0 |
| When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models | Jun 5, 2025 | Hallucination, Misinformation | Unverified | 0 |
| GOLFer: Smaller LM-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval | Jun 5, 2025 | Hallucination, Information Retrieval | Code Available | 0 |
| On the Fundamental Impossibility of Hallucination Control in Large Language Models | Jun 4, 2025 | Hallucination | Unverified | 0 |
| CHIME: Conditional Hallucination and Integrated Multi-scale Enhancement for Time Series Diffusion Model | Jun 4, 2025 | Denoising, Hallucination | Unverified | 0 |
| OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Jun 4, 2025 | Action Generation, Decision Making | Code Available | 1 |
| Magic Mushroom: A Customizable Benchmark for Fine-grained Analysis of Retrieval Noise Erosion in RAG Systems | Jun 4, 2025 | Denoising, Hallucination | Unverified | 0 |