| HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection | May 1, 2025 | Extractive Question-Answering, Hallucination | Unverified | 0 |
| Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models | May 1, 2025 | Hallucination | Unverified | 0 |
| SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | May 1, 2025 | Hallucination, Navigate | Code Available | 0 |
| Efficient and robust 3D blind harmonization for large domain gaps | Apr 30, 2025 | Hallucination, Image Harmonization | Unverified | 0 |
| MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness | Apr 30, 2025 | Hallucination | Unverified | 0 |
| Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination, Object | Unverified | 0 |
| Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs | Apr 30, 2025 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception | Apr 29, 2025 | counterfactual, Hallucination | Code Available | 1 |
| Can LLMs Detect Intrinsic Hallucinations in Paraphrasing and Machine Translation? | Apr 29, 2025 | Hallucination, Machine Translation | Unverified | 0 |
| Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges | Apr 29, 2025 | Code Generation, Hallucination | Unverified | 0 |
| An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination | Apr 28, 2025 | Code Generation, Hallucination | Unverified | 0 |
| Explanatory Summarization with Discourse-Driven Planning | Apr 27, 2025 | Hallucination, Lay Summarization | Unverified | 0 |
| Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Apr 27, 2025 | Hallucination, Question Answering | Code Available | 5 |
| Validating Network Protocol Parsers with Traceable RFC Document Interpretation | Apr 25, 2025 | Hallucination | Unverified | 0 |
| Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Apr 24, 2025 | Conformal Prediction, Hallucination | Unverified | 0 |
| Toward Personalizing Quantum Computing Education: An Evolutionary LLM-Powered Approach | Apr 24, 2025 | Hallucination, Large Language Model | Unverified | 0 |
| The Dance of Atoms-De Novo Protein Design with Diffusion Model | Apr 23, 2025 | Hallucination, Protein Design | Unverified | 0 |
| (Im)possibility of Automated Hallucination Detection in Large Language Models | Apr 23, 2025 | Hallucination, Language Identification | Unverified | 0 |
| Grounded in Context: Retrieval-Based Method for Hallucination Detection | Apr 22, 2025 | Hallucination, Natural Language Inference | Unverified | 0 |
| Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback | Apr 22, 2025 | Code Generation, Hallucination | Unverified | 0 |
| DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding | Apr 21, 2025 | Hallucination | Code Available | 2 |
| POLYRAG: Integrating Polyviews into Retrieval-Augmented Generation for Medical Applications | Apr 21, 2025 | Hallucination, Logical Reasoning | Unverified | 0 |
| aiXamine: Simplified LLM Safety and Security | Apr 21, 2025 | 2k, Adversarial Robustness | Unverified | 0 |
| ResNetVLLM-2: Addressing ResNetVLLM's Multi-Modal Hallucinations | Apr 20, 2025 | Hallucination, Language Modeling | Unverified | 0 |
| Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models | Apr 19, 2025 | Adversarial Attack, Adversarial Defense | Unverified | 0 |