| A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection | Dec 16, 2024 | Hallucination, In-Context Learning | Code Available | 0 |
| CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding | Dec 16, 2024 | Hallucination, Multiple-choice | Unverified | 0 |
| Task-Oriented Dialog Systems for the Senegalese Wolof Language | Dec 15, 2024 | Chatbot, Hallucination | Unverified | 0 |
| Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning | Dec 15, 2024 | Hallucination | Unverified | 0 |
| RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models | Dec 15, 2024 | Autonomous Driving, Contrastive Learning | Unverified | 0 |
| Thinking with Knowledge Graphs: Enhancing LLM Reasoning Through Structured Data | Dec 14, 2024 | Hallucination, Knowledge Graphs | Unverified | 0 |
| NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries | Dec 14, 2024 | Benchmarking, Embodied Question Answering | Unverified | 0 |
| Accelerating Retrieval-Augmented Generation | Dec 14, 2024 | CPU, Hallucination | Unverified | 0 |
| Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Unanswerable Questions and Ambiguous Prompts | Dec 13, 2024 | Hallucination | Unverified | 0 |
| Benchmarking large language models for materials synthesis: the case of atomic layer deposition | Dec 13, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| TACOMORE: Leveraging the Potential of LLMs in Corpus-based Discourse Analysis with Prompt Engineering | Dec 13, 2024 | Articles, Hallucination | Unverified | 0 |
| Multi-Task Learning with LLMs for Implicit Sentiment Analysis: Data-level and Task-level Automatic Weight Learning | Dec 12, 2024 | Aspect-Based Sentiment Analysis (ABSA), Hallucination | Unverified | 0 |
| Hallucination Elimination and Semantic Enhancement Framework for Vision-Language Models in Traffic Scenarios | Dec 10, 2024 | Autonomous Driving, Descriptive | Code Available | 0 |
| HalluCana: Fixing LLM Hallucination with A Canary Lookahead | Dec 10, 2024 | Hallucination | Unverified | 0 |
| Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study | Dec 9, 2024 | Citation Prediction, Hallucination | Unverified | 0 |
| Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models | Dec 9, 2024 | Hallucination | Code Available | 0 |
| Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent | Dec 7, 2024 | Hallucination, Question Answering | Unverified | 0 |
| Multi-Objective Alignment of Large Language Models Through Hypervolume Maximization | Dec 6, 2024 | Hallucination | Unverified | 0 |
| Steps are all you need: Rethinking STEM Education with Prompt Engineering | Dec 6, 2024 | All, Hallucination | Unverified | 0 |
| LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs | Dec 6, 2024 | Entity Alignment, Entity Embeddings | Unverified | 0 |
| 100% Elimination of Hallucinations on RAGTruth for GPT-4 and GPT-3.5 Turbo | Dec 6, 2024 | Hallucination, RAG | Unverified | 0 |
| Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | Dec 6, 2024 | Document Understanding, Hallucination | Unverified | 0 |
| TOBUGraph: Knowledge Graph-Based Retrieval for Enhanced LLM Performance Beyond RAG | Dec 6, 2024 | Chunking, Hallucination | Unverified | 0 |
| Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models | Dec 6, 2024 | Hallucination, Optical Character Recognition (OCR) | Unverified | 0 |
| Deep priors for satellite image restoration with accurate uncertainties | Dec 5, 2024 | Deblurring, Denoising | Unverified | 0 |
| Reducing Tool Hallucination via Reliability Alignment | Dec 5, 2024 | Hallucination, Text Generation | Unverified | 0 |
| GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration | Dec 5, 2024 | Attribute, Hallucination | Unverified | 0 |
| VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding | Dec 4, 2024 | Hallucination, Instruction Following | Unverified | 0 |
| Who Brings the Frisbee: Probing Hidden Hallucination Factors in Large Vision-Language Model via Causality Analysis | Dec 4, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| An Evolutionary Large Language Model for Hallucination Mitigation | Dec 3, 2024 | Dataset Generation, Hallucination | Unverified | 0 |
| CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | Dec 3, 2024 | Hallucination, Key Information Extraction | Unverified | 0 |
| AI Benchmarks and Datasets for LLM Evaluation | Dec 2, 2024 | Benchmarking, Distributed Computing | Unverified | 0 |
| Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Dec 1, 2024 | Action Detection, Activity Detection | Code Available | 0 |
| Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Nov 28, 2024 | Attribute, Hallucination | Unverified | 0 |
| DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models | Nov 27, 2024 | Attribute, Hallucination | Unverified | 0 |
| OPCap: Object-aware Prompting Captioning | Nov 27, 2024 | Attribute, Decoder | Unverified | 0 |
| Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach | Nov 26, 2024 | Hallucination | Unverified | 0 |
| Meaningless is better: hashing bias-inducing words in LLM prompts improves performance in logical reasoning and statistical learning | Nov 26, 2024 | Hallucination, Logical Reasoning | Unverified | 0 |
| A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs | Nov 26, 2024 | Hallucination | Unverified | 0 |
| VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models | Nov 26, 2024 | Hallucination | Unverified | 0 |
| AI2T: Building Trustable AI Tutors by Interactively Teaching a Self-Aware Learning Agent | Nov 26, 2024 | Hallucination | Unverified | 0 |
| Enhancing Multi-Agent Consensus through Third-Party LLM Integration: Analyzing Uncertainty and Mitigating Hallucinations in Large Language Models | Nov 25, 2024 | Hallucination | Unverified | 0 |
| Ontology-Constrained Generation of Domain-Specific Clinical Summaries | Nov 23, 2024 | Hallucination, Text Summarization | Code Available | 0 |
| Leveraging LLMs for Legacy Code Modernization: Challenges and Opportunities for LLM-Generated Documentation | Nov 22, 2024 | Hallucination | Unverified | 0 |
| Detecting Hallucinations in Virtual Histology with Neural Precursors | Nov 22, 2024 | Hallucination, Virtual Staining | Unverified | 0 |
| ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models | Nov 22, 2024 | Hallucination, Object | Unverified | 0 |
| Sycophancy in Large Language Models: Causes and Mitigations | Nov 22, 2024 | Hallucination | Unverified | 0 |
| CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs | Nov 19, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Can Open-source LLMs Enhance Data Synthesis for Toxic Detection?: An Experimental Study | Nov 18, 2024 | Data Augmentation, Hallucination | Unverified | 0 |
| Mitigating Knowledge Conflicts in Language Model-Driven Question Answering | Nov 18, 2024 | Document Summarization, Hallucination | Unverified | 0 |