| Ornithologist: Towards Trustworthy "Reasoning" about Central Bank Communications | May 14, 2025 | Hallucination, Language Modeling | Unverified | 0 |
| Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training | May 13, 2025 | Hallucination, Large Language Model | Code Available | 0 |
| Improving the Reliability of LLMs: Combining CoT, RAG, Self-Consistency, and Self-Verification | May 13, 2025 | Hallucination, RAG | Unverified | 0 |
| Adaptive Schema-aware Event Extraction with Retrieval-Augmented Generation | May 13, 2025 | Event Extraction, Hallucination | Unverified | 0 |
| A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs | May 13, 2025 | Hallucination, Uncertainty Quantification | Code Available | 1 |
| SEReDeEP: Hallucination Detection in Retrieval-Augmented Models via Semantic Entropy and Context-Parameter Fusion | May 12, 2025 | Hallucination, RAG | Unverified | 0 |
| On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Study | May 12, 2025 | GPU, Hallucination | Unverified | 0 |
| Multimodal Survival Modeling in the Age of Foundation Models | May 12, 2025 | Hallucination, Survival Prediction | Code Available | 0 |
| Critique Before Thinking: Mitigating Hallucination through Rationale-Augmented Instruction Tuning | May 12, 2025 | Hallucination, Multimodal Reasoning | Unverified | 0 |
| TrumorGPT: Graph-Based Retrieval-Augmented Large Language Model for Fact-Checking | May 11, 2025 | Fact Checking, Few-Shot Learning | Unverified | 0 |
| Hallucination-Aware Multimodal Benchmark for Gastrointestinal Image Analysis with Large Vision-Language Models | May 11, 2025 | Descriptive, Diagnostic | Code Available | 1 |
| Evolutionary thoughts: integration of large language models and evolutionary algorithms | May 9, 2025 | Evolutionary Algorithms, Hallucination | Code Available | 0 |
| Osiris: A Lightweight Open-Source Hallucination Detection System | May 7, 2025 | Hallucination, RAG | Unverified | 0 |
| Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | May 7, 2025 | Benchmarking, Hallucination | Code Available | 1 |
| Interpretable Zero-shot Learning with Infinite Class Concepts | May 6, 2025 | Hallucination, Zero-Shot Learning | Unverified | 0 |
| Mitigating Image Captioning Hallucinations in Vision-Language Models | May 6, 2025 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering | May 5, 2025 | Hallucination, Question Answering | Code Available | 1 |
| UCSC at SemEval-2025 Task 3: Context, Models and Prompt Optimization for Automated Hallucination Detection in LLM Output | May 5, 2025 | Hallucination | Code Available | 0 |
| Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation | May 5, 2025 | Entity Disambiguation, Hallucination | Unverified | 0 |
| A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models | May 4, 2025 | Attribute, Hallucination | Unverified | 0 |
| SEval-Ex: A Statement-Level Framework for Explainable Summarization Evaluation | May 4, 2025 | Hallucination, Text Summarization | Unverified | 0 |
| Regression is all you need for medical image translation | May 4, 2025 | All, Hallucination | Code Available | 0 |
| Multi-agents based User Values Mining for Recommendation | May 2, 2025 | Hallucination, Recommendation Systems | Unverified | 0 |
| VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding | May 2, 2025 | Anomaly Detection, Common Sense Reasoning | Code Available | 1 |
| Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer | May 2, 2025 | document understanding, Hallucination | Unverified | 0 |
| HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection | May 1, 2025 | Extractive Question-Answering, Hallucination | Unverified | 0 |
| Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models | May 1, 2025 | Hallucination | Unverified | 0 |
| SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | May 1, 2025 | Hallucination, Navigate | Code Available | 0 |
| Efficient and robust 3D blind harmonization for large domain gaps | Apr 30, 2025 | Hallucination, Image Harmonization | Unverified | 0 |
| MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness | Apr 30, 2025 | Hallucination | Unverified | 0 |
| Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination, Object | Unverified | 0 |
| Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs | Apr 30, 2025 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception | Apr 29, 2025 | counterfactual, Hallucination | Code Available | 1 |
| Can LLMs Detect Intrinsic Hallucinations in Paraphrasing and Machine Translation? | Apr 29, 2025 | Hallucination, Machine Translation | Unverified | 0 |
| Hallucination by Code Generation LLMs: Taxonomy, Benchmarks, Mitigation, and Challenges | Apr 29, 2025 | Code Generation, Hallucination | Unverified | 0 |
| An Automated Reinforcement Learning Reward Design Framework with Large Language Model for Cooperative Platoon Coordination | Apr 28, 2025 | Code Generation, Hallucination | Unverified | 0 |
| Explanatory Summarization with Discourse-Driven Planning | Apr 27, 2025 | Hallucination, Lay Summarization | Unverified | 0 |
| Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Apr 27, 2025 | Hallucination, Question Answering | Code Available | 5 |
| Validating Network Protocol Parsers with Traceable RFC Document Interpretation | Apr 25, 2025 | Hallucination | Unverified | 0 |
| Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Apr 24, 2025 | Conformal Prediction, Hallucination | Unverified | 0 |
| Toward Personalizing Quantum Computing Education: An Evolutionary LLM-Powered Approach | Apr 24, 2025 | Hallucination, Large Language Model | Unverified | 0 |
| The Dance of Atoms-De Novo Protein Design with Diffusion Model | Apr 23, 2025 | Hallucination, Protein Design | Unverified | 0 |
| (Im)possibility of Automated Hallucination Detection in Large Language Models | Apr 23, 2025 | Hallucination, Language Identification | Unverified | 0 |
| Grounded in Context: Retrieval-Based Method for Hallucination Detection | Apr 22, 2025 | Hallucination, Natural Language Inference | Unverified | 0 |
| Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback | Apr 22, 2025 | Code Generation, Hallucination | Unverified | 0 |
| DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding | Apr 21, 2025 | Hallucination | Code Available | 2 |
| POLYRAG: Integrating Polyviews into Retrieval-Augmented Generation for Medical Applications | Apr 21, 2025 | Hallucination, Logical Reasoning | Unverified | 0 |
| aiXamine: Simplified LLM Safety and Security | Apr 21, 2025 | 2k, Adversarial Robustness | Unverified | 0 |
| ResNetVLLM-2: Addressing ResNetVLLM's Multi-Modal Hallucinations | Apr 20, 2025 | Hallucination, Language Modeling | Unverified | 0 |
| Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models | Apr 19, 2025 | Adversarial Attack, Adversarial Defense | Unverified | 0 |