| Title | Date | Tags | Code | Count |
|---|---|---|---|---|
| Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | Mar 7, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Effectiveness Assessment of Recent Large Vision-Language Models | Mar 7, 2024 | Anomaly Detection, Attribute | Unverified | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Mar 6, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset | Mar 6, 2024 | Hallucination, In-Context Learning | Code Available | 0 |
| KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Mar 5, 2024 | Hallucination, Self-Learning | Code Available | 3 |
| InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers | Mar 5, 2024 | Hallucination | Code Available | 1 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | Mar 4, 2024 | 1 Image, 2*2 Stitching; Arithmetic Reasoning | Unverified | 0 |
| Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering | Mar 3, 2024 | Claim Verification, Graph Question Answering | Unverified | 0 |
| Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models | Mar 3, 2024 | Hallucination | Unverified | 0 |
| CR-LT-KGQA: A Knowledge Graph Question Answering Dataset Requiring Commonsense Reasoning and Long-Tail Knowledge | Mar 3, 2024 | Claim Verification, Graph Question Answering | Code Available | 1 |
| In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation | Mar 3, 2024 | Hallucination, TruthfulQA | Code Available | 2 |
| MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection | Mar 1, 2024 | Data Augmentation, Hallucination | Unverified | 0 |
| DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models | Mar 1, 2024 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding | Mar 1, 2024 | Hallucination, Object | Code Available | 2 |
| Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models | Mar 1, 2024 | Hallucination, Retrieval | Unverified | 0 |
| Self-Consistent Decoding for More Factual Open Responses | Mar 1, 2024 | Hallucination, Response Generation | Code Available | 0 |
| Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | Feb 29, 2024 | Hallucination | Unverified | 0 |
| The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Feb 29, 2024 | All, Hallucination | Code Available | 4 |
| Navigating Hallucinations for Reasoning of Unintentional Activities | Feb 29, 2024 | Hallucination, Navigate | Unverified | 0 |
| Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore | Feb 28, 2024 | Diversity, Form | Code Available | 0 |
| Collaborative decoding of critical tokens for boosting factuality of large language models | Feb 28, 2024 | Hallucination, Instruction Following | Unverified | 0 |
| All in an Aggregated Image for In-Image Learning | Feb 28, 2024 | All, Hallucination | Code Available | 1 |
| Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models | Feb 28, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models | Feb 27, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | Feb 27, 2024 | Contrastive Learning, Hallucination | Code Available | 2 |