| Title | Date | Tags | Code |
|---|---|---|---|
| LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Oct 13, 2024 | Hallucination, Hallucination Evaluation | Code Available |
| VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment | Oct 12, 2024 | Diversity, Hallucination | Unverified |
| Measuring the Inconsistency of Large Language Models in Preferential Ranking | Oct 11, 2024 | Diagnostic, Hallucination | Unverified |
| A Methodology for Evaluating RAG Systems: A Case Study On Configuration Dependency Validation | Oct 11, 2024 | Hallucination, RAG | Code Available |
| LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts | Oct 10, 2024 | Hallucination | Unverified |
| PublicHearingBR: A Brazilian Portuguese Dataset of Public Hearing Transcripts for Summarization of Long Documents | Oct 10, 2024 | Articles, Document Summarization | Unverified |
| Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study over Open-ended Question Answering | Oct 10, 2024 | Hallucination, Knowledge Graphs | Unverified |
| Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Oct 9, 2024 | Hallucination, Multiple-choice | Code Available |
| From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models | Oct 9, 2024 | Attribute, Hallucination | Unverified |
| FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning | Oct 8, 2024 | GSM8K, Hallucination | Unverified |
| Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models | Oct 8, 2024 | Hallucination, Overall - Test | Unverified |
| Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation | Oct 8, 2024 | Dialogue Generation, Hallucination | Unverified |
| EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment | Oct 8, 2024 | cross-modal alignment, Hallucination | Unverified |
| AI-Enhanced Ethical Hacking: A Linux-Focused Experiment | Oct 7, 2024 | Hallucination | Unverified |
| TLDR: Token-Level Detective Reward Model for Large Vision Language Models | Oct 7, 2024 | Hallucination, Hallucination Evaluation | Unverified |
| DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination | Oct 6, 2024 | Attribute, Decoder | Unverified |
| Mitigating Hallucinations Using Ensemble of Knowledge Graph and Vector Store in Large Language Models to Enhance Mental Health Support | Oct 6, 2024 | Hallucination | Unverified |
| DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech | Oct 5, 2024 | Hallucination, Knowledge Distillation | Unverified |
| TUBench: Benchmarking Large Vision-Language Models on Trustworthiness with Unanswerable Questions | Oct 5, 2024 | Benchmarking, Hallucination | Code Available |
| Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval Augmented Generation | Oct 4, 2024 | Domain Adaptation, Hallucination | Unverified |
| SAG: Style-Aligned Article Generation via Model Collaboration | Oct 4, 2024 | Hallucination, Instruction Following | Unverified |
| Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models | Oct 4, 2024 | counterfactual, Data Augmentation | Code Available |
| FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs | Oct 3, 2024 | Hallucination | Unverified |
| Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization | Oct 3, 2024 | Abstractive Text Summarization, Hallucination | Code Available |
| Characterizing Context Influence and Hallucination in Summarization | Oct 3, 2024 | Hallucination | Code Available |
| Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration | Oct 2, 2024 | Hallucination | Unverified |
| LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models | Oct 2, 2024 | Hallucination | Unverified |
| The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs | Oct 2, 2024 | Benchmarking, Hallucination | Unverified |
| BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation | Oct 2, 2024 | Hallucination, RAG | Code Available |
| VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models | Oct 1, 2024 | Hallucination, text similarity | Unverified |
| ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding | Oct 1, 2024 | Contrastive Learning, Hallucination | Code Available |
| HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding | Sep 30, 2024 | Hallucination, Object | Code Available |
| Contrastive Token Learning with Similarity Decay for Repetition Suppression in Machine Translation | Sep 30, 2024 | Hallucination, Machine Translation | Unverified |
| Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG | Sep 30, 2024 | Hallucination, RAG | Unverified |
| LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation | Sep 30, 2024 | Code Generation, Hallucination | Code Available |
| MedHalu: Hallucinations in Responses to Healthcare Queries by Large Language Models | Sep 29, 2024 | Hallucination | Unverified |
| DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning | Sep 28, 2024 | Hallucination, Image Captioning | Unverified |
| HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection | Sep 26, 2024 | Hallucination | Code Available |
| Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts | Sep 25, 2024 | Hallucination | Code Available |
| RoleBreak: Character Hallucination as a Jailbreak Attack in Role-Playing Systems | Sep 25, 2024 | Hallucination | Unverified |
| Enhancing Guardrails for Safe and Secure Healthcare AI | Sep 25, 2024 | Hallucination, Misinformation | Unverified |
| A Unified Hallucination Mitigation Framework for Large Vision-Language Models | Sep 24, 2024 | Hallucination, Question Answering | Code Available |
| Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection | Sep 24, 2024 | Hallucination, Semantic Parsing | Unverified |
| Long-horizon Embodied Planning with Implicit Logical Inference and Hallucination Mitigation | Sep 24, 2024 | Diversity, Hallucination | Unverified |
| AsthmaBot: Multi-modal, Multi-Lingual Retrieval Augmented Generation For Asthma Patient Support | Sep 24, 2024 | Hallucination, Question Answering | Unverified |
| Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework | Sep 24, 2024 | Benchmarking, counterfactual | Code Available |
| Planning in the Dark: LLM-Symbolic Planning Pipeline without Experts | Sep 24, 2024 | Hallucination | Unverified |
| A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? | Sep 23, 2024 | Hallucination, MedQA | Unverified |
| Parse Trees Guided LLM Prompt Compression | Sep 23, 2024 | Hallucination | Code Available |
| Enhancing Scientific Reproducibility Through Automated BioCompute Object Creation Using Retrieval-Augmented Generation from Publications | Sep 23, 2024 | Hallucination, Long-Context Understanding | Unverified |