| VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models | Jan 1, 2025 | Hallucination | Unverified | 0 | 0 |
| Robust Graph Meta-learning for Weakly-supervised Few-shot Node Classification | Jun 12, 2021 | Classification, Drug Design | Unverified | 0 | 0 |
| What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis | Feb 19, 2025 | Hallucination, Language Modeling | Unverified | 0 | 0 |
| What does it take to get state of the art in simultaneous speech-to-speech translation? | Sep 2, 2024 | Hallucination, Management | Unverified | 0 | 0 |
| What External Knowledge is Preferred by LLMs? Characterizing and Exploring Chain of Evidence in Imperfect Context | Dec 17, 2024 | Hallucination, Misinformation | Unverified | 0 | 0 |
| What Matters in Memorizing and Recalling Facts? Multifaceted Benchmarks for Knowledge Probing in Language Models | Jun 18, 2024 | Decoder, Hallucination | Unverified | 0 | 0 |
| When Not to Answer: Evaluating Prompts on GPT Models for Effective Abstention in Unanswerable Math Word Problems | Oct 16, 2024 | Hallucination, Math | Unverified | 0 | 0 |
| When Thinking LLMs Lie: Unveiling the Strategic Deception in Representations of Reasoning Models | Jun 5, 2025 | Hallucination, Misinformation | Unverified | 0 | 0 |
| When to Speak, When to Abstain: Contrastive Decoding with Abstention | Dec 17, 2024 | Hallucination, Question Answering | Unverified | 0 | 0 |