| Optimal Transport for Unsupervised Hallucination Detection in Neural Machine Translation | Dec 19, 2022 | Decoder, Hallucination | Code Available | 0 |
| Correction with Backtracking Reduces Hallucination in Summarization | Oct 24, 2023 | Abstractive Text Summarization, Hallucination | Code Available | 0 |
| Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models | Aug 9, 2024 | Hallucination | Code Available | 0 |
| Few-shot learning via tensor hallucination | Apr 19, 2021 | Data Augmentation, Few-Shot Learning | Code Available | 0 |
| Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling | May 1, 2024 | Hallucination, Topic Classification | Code Available | 0 |
| OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection | Jun 4, 2024 | Hallucination, Machine Translation | Code Available | 0 |
| Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology | Apr 24, 2023 | Benchmarking, Decision Making | Code Available | 0 |
| JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Sep 19, 2024 | Hallucination, Image Captioning | Code Available | 0 |
| Joint stereo 3D object detection and implicit surface reconstruction | Nov 25, 2021 | 3D Object Detection, Hallucination | Code Available | 0 |
| Abstract Meaning Representation for Hospital Discharge Summarization | Jun 17, 2025 | Abstract Meaning Representation, Hallucination | Code Available | 0 |
| Iterative Teaching by Data Hallucination | Oct 31, 2022 | Hallucination | Code Available | 0 |
| PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning | May 23, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 0 |
| Fakes of Varying Shades: How Warning Affects Human Perception and Engagement Regarding LLM Hallucinations | Apr 4, 2024 | Hallucination, Human Detection | Code Available | 0 |
| Investigating the performance of Retrieval-Augmented Generation and fine-tuning for the development of AI-driven knowledge-based systems | Mar 12, 2024 | Domain Adaptation, Hallucination | Code Available | 0 |
| Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models | Nov 13, 2023 | Hallucination, Machine Translation | Code Available | 0 |
| AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis | Apr 1, 2024 | Binary Classification, Hallucination | Code Available | 0 |
| Conversational Gold: Evaluating Personalized Conversational Search System using Gold Nuggets | Mar 12, 2025 | Answer Generation, Conversational Search | Code Available | 0 |
| Parse Trees Guided LLM Prompt Compression | Sep 23, 2024 | Hallucination | Code Available | 0 |
| Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework | Sep 24, 2024 | Benchmarking, counterfactual | Code Available | 0 |
| Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction | Oct 19, 2024 | counterfactual, Counterfactual Explanation | Code Available | 0 |
| Evolutionary thoughts: integration of large language models and evolutionary algorithms | May 9, 2025 | Evolutionary Algorithms, Hallucination | Code Available | 0 |
| Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models | Oct 4, 2024 | counterfactual, Data Augmentation | Code Available | 0 |
| Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering | Apr 22, 2024 | Hallucination, Prompt Engineering | Code Available | 0 |
| Confidence Estimation for LLM-Based Dialogue State Tracking | Sep 15, 2024 | Dialogue State Tracking, Hallucination | Code Available | 0 |
| Instruction Makes a Difference | Feb 1, 2024 | Hallucination, Instruction Following | Code Available | 0 |
| SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination | Apr 7, 2024 | Hallucination, Machine Translation | Code Available | 0 |
| Incorporating Task-specific Concept Knowledge into Script Learning | Aug 31, 2022 | Contrastive Learning, Hallucination | Code Available | 0 |
| Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification | Nov 15, 2023 | Hallucination, Retrieval | Code Available | 0 |
| SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation | May 1, 2025 | Hallucination, Navigate | Code Available | 0 |
| Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators | Aug 22, 2024 | Hallucination, Mixture-of-Experts | Code Available | 0 |
| PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems | Sep 19, 2023 | Hallucination, Language Modelling | Code Available | 0 |
| Towards Lighter and Robust Evaluation for Retrieval Augmented Generation | Mar 20, 2025 | Hallucination, RAG | Code Available | 0 |
| Improving Factual Error Correction by Learning to Inject Factual Errors | Dec 12, 2023 | Hallucination | Code Available | 0 |
| SmurfCat at SemEval-2024 Task 6: Leveraging Synthetic Data for Hallucination Detection | Apr 9, 2024 | Hallucination | Code Available | 0 |
| Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild | Jan 6, 2025 | Hallucination, Multimodal Reasoning | Code Available | 0 |
| Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations | Apr 17, 2025 | Decoder, Hallucination | Code Available | 0 |
| Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training | Oct 14, 2022 | Hallucination, Image Augmentation | Code Available | 0 |
| Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models | May 7, 2024 | Hallucination, Knowledge Graphs | Code Available | 0 |
| Image Denoising with Control over Deep Network Hallucination | Jan 2, 2022 | Denoising, Hallucination | Code Available | 0 |
| Im2Flow: Motion Hallucination from Static Images for Action Recognition | Dec 12, 2017 | Action Recognition, Activity Recognition | Code Available | 0 |
| PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics | Apr 6, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning | Oct 9, 2024 | Hallucination, Multiple-choice | Code Available | 0 |
| Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision | May 26, 2025 | Hallucination, Math | Code Available | 0 |
| A Benchmark and Robustness Study of In-Context-Learning with Large Language Models in Music Entity Detection | Dec 16, 2024 | Hallucination, In-Context Learning | Code Available | 0 |
| Spurious reconstruction from brain activity | May 16, 2024 | Brain Decoding, Hallucination | Code Available | 0 |
| Im2Avatar: Colorful 3D Reconstruction from a Single Image | Apr 17, 2018 | 3D Reconstruction, Hallucination | Code Available | 0 |
| Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations | Jun 16, 2024 | Hallucination, Misinformation | Code Available | 0 |
| ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models | Mar 8, 2024 | Attribute, Hallucination | Code Available | 0 |
| Behind the Magic, MERLIM: Multi-modal Evaluation Benchmark for Large Image-Language Models | Dec 3, 2023 | Hallucination, Visual Grounding | Code Available | 0 |
| StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation | Jun 19, 2024 | Hallucination, Retrieval | Code Available | 0 |