| Title | Date | Tags | Code | Citations |
| --- | --- | --- | --- | --- |
| ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability | Oct 15, 2024 | Hallucination, RAG | Unverified | 0 |
| LargePiG: Your Large Language Model is Secretly a Pointer Generator | Oct 15, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models | Oct 15, 2024 | Hallucination, Large Language Model | Code Available | 0 |
| Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs | Oct 15, 2024 | Hallucination | Unverified | 0 |
| On the Capacity of Citation Generation by Large Language Models | Oct 15, 2024 | Attribute, Hallucination | Unverified | 0 |
| Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions | Oct 15, 2024 | Hallucination | Unverified | 0 |
| MLLM Can See? Dynamic Correction Decoding for Hallucination Mitigation | Oct 15, 2024 | Hallucination, Language Modeling | Code Available | 2 |
| AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data | Oct 15, 2024 | Hallucination, Knowledge Graphs | Unverified | 0 |
| Can Structured Data Reduce Epistemic Uncertainty? | Oct 14, 2024 | Hallucination, Retrieval | Unverified | 0 |
| Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning | Oct 14, 2024 | Hallucination, RAG | Unverified | 0 |
| Medico: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion | Oct 14, 2024 | Hallucination | Unverified | 0 |
| SkillAggregation: Reference-free LLM-Dependent Aggregation | Oct 14, 2024 | Chatbot, Hallucination | Unverified | 0 |
| VideoAgent: Self-Improving Video Generation | Oct 14, 2024 | Hallucination, Video Generation | Code Available | 2 |
| LongHalQA: Long-Context Hallucination Evaluation for Multimodal Large Language Models | Oct 13, 2024 | Hallucination, Hallucination Evaluation | Code Available | 0 |
| Honest AI: Fine-Tuning "Small" Language Models to Say "I Don't Know", and Reducing Hallucination in RAG | Oct 13, 2024 | Hallucination, RAG | Unverified | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | Oct 13, 2024 | Code Generation, Hallucination | Unverified | 0 |
| VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment | Oct 12, 2024 | Diversity, Hallucination | Unverified | 0 |
| Measuring the Inconsistency of Large Language Models in Preferential Ranking | Oct 11, 2024 | Diagnostic, Hallucination | Unverified | 0 |
| A Methodology for Evaluating RAG Systems: A Case Study on Configuration Dependency Validation | Oct 11, 2024 | Hallucination, RAG | Code Available | 0 |
| VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding | Oct 11, 2024 | Hallucination, Moment Retrieval | Code Available | 1 |
| PublicHearingBR: A Brazilian Portuguese Dataset of Public Hearing Transcripts for Summarization of Long Documents | Oct 10, 2024 | Articles, Document Summarization | Unverified | 0 |
| Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | Oct 10, 2024 | Hallucination, Knowledge Graphs | Unverified | 0 |
| OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting | Oct 10, 2024 | Entity Linking, Few-Shot Learning | Code Available | 1 |
| Automatic Curriculum Expert Iteration for Reliable LLM Reasoning | Oct 10, 2024 | Hallucination, Logical Reasoning | Code Available | 1 |
| LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Oct 10, 2024 | Hallucination | Unverified | 0 |
| IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking | Oct 9, 2024 | ARC, Code Generation | Code Available | 1 |
| From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models | Oct 9, 2024 | Attribute, Hallucination | Unverified | 0 |
| Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Oct 9, 2024 | Hallucination, Multiple-choice | Code Available | 0 |
| Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making | Oct 9, 2024 | Benchmarking, Decision Making | Code Available | 3 |
| EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment | Oct 8, 2024 | Cross-modal Alignment, Hallucination | Unverified | 0 |
| ReFIR: Grounding Large Restoration Models with Retrieval Augmentation | Oct 8, 2024 | Hallucination, Image Restoration | Code Available | 2 |
| Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation | Oct 8, 2024 | Dialogue Generation, Hallucination | Unverified | 0 |
| Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models | Oct 8, 2024 | Hallucination, Overall - Test | Unverified | 0 |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | Oct 8, 2024 | GSM8K, Hallucination | Unverified | 0 |
| Differential Transformer | Oct 7, 2024 | Hallucination, In-Context Learning | Code Available | 2 |
| TLDR: Token-Level Detective Reward Model for Large Vision Language Models | Oct 7, 2024 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| AI-Enhanced Ethical Hacking: A Linux-Focused Experiment | Oct 7, 2024 | Hallucination | Unverified | 0 |
| Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality | Oct 7, 2024 | Causal Inference, Counterfactual | Code Available | 2 |
| Mitigating Hallucinations Using Ensemble of Knowledge Graph and Vector Store in Large Language Models to Enhance Mental Health Support | Oct 6, 2024 | Hallucination | Unverified | 0 |
| DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination | Oct 6, 2024 | Attribute, Decoder | Unverified | 0 |
| DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech | Oct 5, 2024 | Hallucination, Knowledge Distillation | Unverified | 0 |
| TUBench: Benchmarking Large Vision-Language Models on Trustworthiness with Unanswerable Questions | Oct 5, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| SAG: Style-Aligned Article Generation via Model Collaboration | Oct 4, 2024 | Hallucination, Instruction Following | Unverified | 0 |
| Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval Augmented Generation | Oct 4, 2024 | Domain Adaptation, Hallucination | Unverified | 0 |
| Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models | Oct 4, 2024 | Decoder, Hallucination | Code Available | 2 |
| Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models | Oct 4, 2024 | Counterfactual, Data Augmentation | Code Available | 0 |
| FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs | Oct 3, 2024 | Hallucination | Unverified | 0 |
| Characterizing Context Influence and Hallucination in Summarization | Oct 3, 2024 | Hallucination | Code Available | 0 |
| CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation | Oct 3, 2024 | Abstractive Text Summarization, Hallucination | Code Available | 1 |
| Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization | Oct 3, 2024 | Abstractive Text Summarization, Hallucination | Code Available | 0 |