| Title | Date | Topics | Code | Count |
|---|---|---|---|---|
| LLM Internal States Reveal Hallucination Risk Faced With a Query | Jul 3, 2024 | Hallucination, Response Generation | Code Available | 0 |
| FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering | Jul 3, 2024 | Hallucination, Multi-hop Question Answering | Unverified | 0 |
| MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context | Jul 3, 2024 | Hallucination, Response Generation | Code Available | 1 |
| A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation | Jul 3, 2024 | Code Generation, Hallucination | Unverified | 0 |
| Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification | Jul 2, 2024 | Claim Verification, Hallucination | Unverified | 0 |
| Understanding Alignment in Multimodal LLMs: A Comprehensive Study | Jul 2, 2024 | Hallucination | Unverified | 0 |
| MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation | Jul 2, 2024 | Hallucination, RAG | Code Available | 2 |
| The Need for Guardrails with Large Language Models in Medical Safety-Critical Settings: An Artificial Intelligence Application in the Pharmacovigilance Ecosystem | Jul 1, 2024 | Hallucination, Pharmacovigilance | Unverified | 0 |
| LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation | Jul 1, 2024 | Hallucination, Uncertainty Quantification | Unverified | 0 |
| Free-text Rationale Generation under Readability Level Control | Jul 1, 2024 | Hallucination, Text Generation | Unverified | 0 |
| Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks | Jul 1, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| FineSurE: Fine-grained Summarization Evaluation using LLMs | Jul 1, 2024 | Benchmarking, Hallucination | Code Available | 1 |
| Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP | Jun 30, 2024 | Hallucination, Image Comprehension | Unverified | 0 |
| Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Jun 30, 2024 | Hallucination, Multimodal Interaction | Code Available | 1 |
| BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science | Jun 29, 2024 | AI Agent, Claim Verification | Code Available | 0 |
| GraphArena: Benchmarking Large Language Models on Graph Computational Problems | Jun 29, 2024 | Benchmarking, Hallucination | Code Available | 1 |
| PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models | Jun 29, 2024 | Hallucination, Sentence | Unverified | 0 |
| A Study on Effect of Reference Knowledge Choice in Generating Technical Content Relevant to SAPPhIRE Model Using Large Language Model | Jun 29, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Jun 28, 2024 | Code Generation, Hallucination | Unverified | 0 |
| ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models | Jun 28, 2024 | Diagnostic, Hallucination | Code Available | 1 |
| Handling Ontology Gaps in Semantic Parsing | Jun 27, 2024 | Hallucination, Question Answering | Code Available | 0 |
| From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Jun 27, 2024 | Hallucination, Information Retrieval | Code Available | 0 |
| Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation | Jun 26, 2024 | Hallucination, Knowledge Base Question Answering | Code Available | 2 |
| Mitigating Hallucination in Fictional Character Role-Play | Jun 25, 2024 | Hallucination, World Knowledge | Code Available | 0 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Jun 24, 2024 | Hallucination | Code Available | 1 |