| Title | Date | Topics | Code Status | Count |
| --- | --- | --- | --- | --- |
| FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs | Jan 17, 2025 | Hallucination, Knowledge Graphs | Unverified | 0 |
| ArxEval: Evaluating Retrieval and Generation in Language Models for Scientific Literature | Jan 17, 2025 | Hallucination, Retrieval | Unverified | 0 |
| A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy | Jan 16, 2025 | Hallucination, Survey | Unverified | 0 |
| HALoGEN: Fantastic LLM Hallucinations and Where to Find Them | Jan 14, 2025 | Hallucination, World Knowledge | Unverified | 0 |
| GPT as a Monte Carlo Language Tree: A Probabilistic Perspective | Jan 13, 2025 | Hallucination | Unverified | 0 |
| MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare | Jan 11, 2025 | Diagnostic, Entity Linking | Unverified | 0 |
| Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering | Jan 11, 2025 | Hallucination, Question Answering | Code Available | 0 |
| Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea | Jan 10, 2025 | Hallucination, Misinformation | Unverified | 0 |
| Seeing with Partial Certainty: Conformal Prediction for Robotic Scene Recognition in Built Environments | Jan 9, 2025 | Conformal Prediction, Hallucination | Unverified | 0 |
| Feedback-Driven Vision-Language Alignment with Minimal Human Supervision | Jan 8, 2025 | Hallucination, Question Answering | Unverified | 0 |
| RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance | Jan 7, 2025 | Hallucination, RAG | Unverified | 0 |
| EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models | Jan 6, 2025 | Hallucination, Visual Grounding | Unverified | 0 |
| Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild | Jan 6, 2025 | Hallucination, Multimodal Reasoning | Code Available | 0 |
| Foundations of GenIR | Jan 6, 2025 | Hallucination, Retrieval-augmented Generation | Unverified | 0 |
| FlippedRAG: Black-Box Opinion Manipulation Adversarial Attacks to Retrieval-Augmented Generation Models | Jan 6, 2025 | Adversarial Attack, Hallucination | Unverified | 0 |
| CHAIR -- Classifier of Hallucination as Improver | Jan 5, 2025 | Hallucination, MMLU | Code Available | 0 |
| LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries | Jan 3, 2025 | Hallucination, zero-shot-classification | Unverified | 0 |
| CarbonChat: Large Language Model-Based Corporate Carbon Emission Analysis and Climate Knowledge Q&A System | Jan 3, 2025 | Chunking, Hallucination | Unverified | 0 |
| Large Language Model-Enhanced Symbolic Reasoning for Knowledge Base Completion | Jan 2, 2025 | Diversity, Hallucination | Unverified | 0 |
| Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking | Jan 2, 2025 | Hallucination, Text Generation | Unverified | 0 |
| Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection | Jan 2, 2025 | Hallucination, Sentence | Unverified | 0 |
| RRHF-V: Ranking Responses to Mitigate Hallucinations in Multimodal Large Language Models with Human Feedback | Jan 1, 2025 | Hallucination, Image Comprehension | Code Available | 0 |
| IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models | Jan 1, 2025 | Hallucination, Multiple-choice | Unverified | 0 |
| Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target. | Jan 1, 2025 | All, Hallucination | Unverified | 0 |
| POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation | Jan 1, 2025 | Hallucination, Reasoning Segmentation | Unverified | 0 |
| VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models | Jan 1, 2025 | Hallucination | Unverified | 0 |
| A review of faithfulness metrics for hallucination assessment in Large Language Models | Dec 31, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| Distilling Desired Comments for Enhanced Code Review with Large Language Models | Dec 29, 2024 | Dataset Distillation, Hallucination | Unverified | 0 |
| HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models | Dec 29, 2024 | Hallucination, Object | Code Available | 0 |
| Is Your Text-to-Image Model Robust to Caption Noise? | Dec 27, 2024 | Descriptive, Hallucination | Unverified | 0 |
| An End-to-End Depth-Based Pipeline for Selfie Image Rectification | Dec 26, 2024 | Depth Estimation, Hallucination | Unverified | 0 |
| MedHallBench: A New Benchmark for Assessing Hallucination in Medical Large Language Models | Dec 25, 2024 | Hallucination, reinforcement-learning | Unverified | 0 |
| Improving Factuality with Explicit Working Memory | Dec 24, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| From Hallucinations to Facts: Enhancing Language Models with Curated Knowledge Graphs | Dec 24, 2024 | Hallucination, Knowledge Graphs | Unverified | 0 |
| Multimodal Preference Data Synthetic Alignment with Reward Model | Dec 23, 2024 | 2k, Caption Generation | Code Available | 0 |
| CiteBART: Learning to Generate Citations for Local Citation Recommendation | Dec 23, 2024 | Citation Prediction, Citation Recommendation | Code Available | 0 |
| AlzheimerRAG: Multimodal Retrieval Augmented Generation for PubMed articles | Dec 21, 2024 | Articles, Decision Making | Unverified | 0 |
| Logical Consistency of Large Language Models in Fact-checking | Dec 20, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage | Dec 20, 2024 | Attribute, Benchmarking | Unverified | 0 |
| Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation | Dec 19, 2024 | Hallucination | Unverified | 0 |
| Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation | Dec 19, 2024 | Hallucination, RAG | Unverified | 0 |
| Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling | Dec 19, 2024 | Hallucination, Text Generation | Unverified | 0 |
| Query pipeline optimization for cancer patient question answering systems | Dec 19, 2024 | Hallucination, Passage Retrieval | Unverified | 0 |
| A Comparative Study of DSPy Teleprompter Algorithms for Aligning Large Language Models Evaluation Metrics to Human Evaluation | Dec 19, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence | Dec 18, 2024 | Hallucination, Multimodal Reasoning | Unverified | 0 |
| Are LLMs Good Literature Review Writers? Evaluating the Literature Review Writing Ability of Large Language Models | Dec 18, 2024 | Hallucination | Unverified | 0 |
| When to Speak, When to Abstain: Contrastive Decoding with Abstention | Dec 17, 2024 | Hallucination, Question Answering | Unverified | 0 |
| A MapReduce Approach to Effectively Utilize Long Context Information in Retrieval Augmented Language Models | Dec 17, 2024 | Hallucination, RAG | Unverified | 0 |
| What External Knowledge is Preferred by LLMs? Characterizing and Exploring Chain of Evidence in Imperfect Context | Dec 17, 2024 | Hallucination, Misinformation | Unverified | 0 |
| ReXTrust: A Model for Fine-Grained Hallucination Detection in AI-Generated Radiology Reports | Dec 17, 2024 | Hallucination | Unverified | 0 |