| Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions | Jun 7, 2024 | Hallucination, Mathematical Reasoning | Unverified | 0 |
| 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination | Jun 7, 2024 | Hallucination | Code Available | 2 |
| Chaos with Keywords: Exposing Large Language Models Sycophantic Hallucination to Misleading Keywords and Evaluating Defense Strategies | Jun 6, 2024 | Hallucination, Knowledge Probing | Unverified | 0 |
| ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints | Jun 6, 2024 | Diagnostic, Hallucination | Unverified | 0 |
| Confabulation: The Surprising Value of Large Language Model Hallucinations | Jun 6, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework | Jun 5, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends | Jun 5, 2024 | Hallucination | Unverified | 0 |
| OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection | Jun 4, 2024 | Hallucination, Machine Translation | Code Available | 0 |
| Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs | Jun 4, 2024 | Benchmarking, Fairness | Unverified | 0 |
| CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models | Jun 4, 2024 | Hallucination, Informativeness | Unverified | 0 |
| How to Explore with Belief: State Entropy Maximization in POMDPs | Jun 4, 2024 | Hallucination | Unverified | 0 |
| Ask-EDA: A Design Assistant Empowered by LLM, Hybrid RAG and Abbreviation De-hallucination | Jun 3, 2024 | Hallucination, Question Answering | Unverified | 0 |
| Large Language Model Assisted Optimal Bidding of BESS in FCAS Market: An AI-agent based Approach | Jun 3, 2024 | AI Agent, Deep Reinforcement Learning | Unverified | 0 |
| Decompose, Enrich, and Extract! Schema-aware Event Extraction using LLMs | Jun 3, 2024 | Decision Making, Event Argument Extraction | Unverified | 0 |
| Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost | Jun 3, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Comprehensive Evaluation of Large Language Models for Topic Modeling | Jun 2, 2024 | Hallucination, Topic Models | Unverified | 0 |
| DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models | May 31, 2024 | Hallucination, Model Editing | Code Available | 0 |
| Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training | May 31, 2024 | Hallucination, Multi-Task Learning | Code Available | 1 |
| Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts | May 30, 2024 | Hallucination | Unverified | 0 |
| Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools | May 30, 2024 | Hallucination, RAG | Unverified | 0 |
| ANAH: Analytical Annotation of Hallucinations in Large Language Models | May 30, 2024 | Generative Question Answering, Hallucination | Code Available | 2 |
| NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models | May 30, 2024 | Hallucination | Code Available | 0 |
| MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification | May 29, 2024 | Hallucination, Image Captioning | Unverified | 0 |
| MASSIVE Multilingual Abstract Meaning Representation: A Dataset and Baselines for Hallucination Detection | May 29, 2024 | Abstract Meaning Representation, Hallucination | Unverified | 0 |
| Two-Layer Retrieval-Augmented Generation Framework for Low-Resource Medical Question Answering Using Reddit Data: Proof-of-Concept Study | May 29, 2024 | Answer Generation, Hallucination | Unverified | 0 |