| Hallucination Mitigation Prompts Long-term Video Understanding | Jun 17, 2024 | Answer Generation, Hallucination | Code Available | 0 |
| Self-training Large Language Models through Knowledge Detection | Jun 17, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals | Jun 16, 2024 | Hallucination | Unverified | 0 |
| Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Jun 16, 2024 | Hallucination, Misinformation | Code Available | 0 |
| Detecting and Evaluating Medical Hallucinations in Large Vision Language Models | Jun 14, 2024 | Hallucination, Medical Visual Question Answering | Unverified | 0 |
| DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Jun 13, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation | Jun 11, 2024 | Hallucination, Hallucination Evaluation | Code Available | 0 |
| Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis | Jun 11, 2024 | Hallucination, Language Modelling | Unverified | 0 |
| Progressive Query Expansion for Retrieval Over Cost-constrained Data Sources | Jun 11, 2024 | Hallucination, Retrieval | Unverified | 0 |
| On the Hallucination in Simultaneous Machine Translation | Jun 11, 2024 | Hallucination, Machine Translation | Code Available | 0 |
| Estimating the Hallucination Rate of Generative AI | Jun 11, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation | Jun 11, 2024 | Hallucination | Code Available | 0 |
| Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation | Jun 8, 2024 | Abstractive Text Summarization, Dialogue Generation | Unverified | 0 |
| Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions | Jun 7, 2024 | Hallucination, Mathematical Reasoning | Unverified | 0 |
| Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies | Jun 6, 2024 | Hallucination, Knowledge Probing | Unverified | 0 |
| Confabulation: The Surprising Value of Large Language Model Hallucinations | Jun 6, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints | Jun 6, 2024 | Diagnostic, Hallucination | Unverified | 0 |
| Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework | Jun 5, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends | Jun 5, 2024 | Hallucination | Unverified | 0 |
| Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs | Jun 4, 2024 | Benchmarking, Fairness | Unverified | 0 |
| How to Explore with Belief: State Entropy Maximization in POMDPs | Jun 4, 2024 | Hallucination | Unverified | 0 |
| CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models | Jun 4, 2024 | Hallucination, Informativeness | Unverified | 0 |
| OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection | Jun 4, 2024 | Hallucination, Machine Translation | Code Available | 0 |
| Ask-EDA: A Design Assistant Empowered by LLM, Hybrid RAG and Abbreviation De-hallucination | Jun 3, 2024 | Hallucination, Question Answering | Unverified | 0 |
| Large Language Model Assisted Optimal Bidding of BESS in FCAS Market: An AI-agent based Approach | Jun 3, 2024 | AI Agent, Deep Reinforcement Learning | Unverified | 0 |
| Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs | Jun 3, 2024 | Decision Making, Event Argument Extraction | Unverified | 0 |
| Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost | Jun 3, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Comprehensive Evaluation of Large Language Models for Topic Modeling | Jun 2, 2024 | Hallucination, Topic Models | Unverified | 0 |
| DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models | May 31, 2024 | Hallucination, Model Editing | Code Available | 0 |
| NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models | May 30, 2024 | Hallucination | Code Available | 0 |
| Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts | May 30, 2024 | All, Hallucination | Unverified | 0 |
| Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools | May 30, 2024 | Hallucination, RAG | Unverified | 0 |
| MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification | May 29, 2024 | Hallucination, Image Captioning | Unverified | 0 |
| MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection | May 29, 2024 | Abstract Meaning Representation, Hallucination | Unverified | 0 |
| Two-Layer Retrieval-Augmented Generation Framework for Low-Resource Medical Question Answering Using Reddit Data: Proof-of-Concept Study | May 29, 2024 | Answer Generation, Hallucination | Unverified | 0 |
| LLMs and Memorization: On Quality and Specificity of Copyright Compliance | May 28, 2024 | Hallucination, Memorization | Code Available | 0 |
| Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action | May 28, 2024 | Conversational Question Answering, Hallucination | Unverified | 0 |
| Data-augmented phrase-level alignment for mitigating object hallucination | May 28, 2024 | Data Augmentation, Hallucination | Unverified | 0 |
| RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language Models | May 28, 2024 | Hallucination, MME | Unverified | 0 |
| Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings | May 27, 2024 | Domain Adaptation, GPU | Unverified | 0 |
| Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks | May 27, 2024 | Hallucination, Object Hallucination | Code Available | 0 |
| GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases | May 25, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization | May 24, 2024 | Hallucination | Code Available | 0 |
| CHARP: Conversation History AwaReness Probing for Knowledge-grounded Dialogue Systems | May 24, 2024 | Diagnostic, Hallucination | Unverified | 0 |
| Large Language Model Pruning | May 24, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Scaling Laws for Discriminative Classification in Large Language Models | May 24, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models | May 23, 2024 | Hallucination, Model Editing | Unverified | 0 |
| GameVLM: A Decision-making Framework for Robotic Task Planning Based on Visual Language Models and Zero-sum Games | May 22, 2024 | Code Generation, Decision Making | Unverified | 0 |
| Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation | May 22, 2024 | Caption Generation, Hallucination | Unverified | 0 |
| CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models | May 22, 2024 | Benchmarking, Hallucination | Unverified | 0 |