| H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models | Nov 6, 2024 | Hallucination, Object | Unverified | 0 |
| Fine-Tuning Vision-Language Model for Automated Engineering Drawing Information Extraction | Nov 6, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation | Nov 6, 2024 | Hallucination, RAG | Unverified | 0 |
| Automated, LLM enabled extraction of synthesis details for reticular materials from scientific literature | Nov 5, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| VERITAS: A Unified Approach to Reliability Evaluation | Nov 5, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Leveraging Vision-Language Models for Manufacturing Feature Recognition in CAD Designs | Nov 5, 2024 | Few-Shot Learning, Hallucination | Unverified | 0 |
| V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization | Nov 5, 2024 | Hallucination, Language Modeling | Code Available | 2 |
| Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent | Nov 5, 2024 | Benchmarking, Hallucination | Code Available | 3 |
| DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark | Nov 5, 2024 | Data Augmentation, Hallucination | Code Available | 0 |
| HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems | Nov 5, 2024 | Hallucination, RAG | Code Available | 3 |
| CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | Nov 4, 2024 | Hallucination, Lighting Estimation | Unverified | 0 |
| Robust plug-and-play methods for highly accelerated non-Cartesian MRI reconstruction | Nov 4, 2024 | compressed sensing, Denoising | Unverified | 0 |
| Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models | Nov 4, 2024 | Experimental Design, Hallucination | Unverified | 0 |
| Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models | Nov 3, 2024 | Hallucination, Instruction Following | Code Available | 0 |
| Towards Multi-Source Retrieval-Augmented Generation via Synergizing Reasoning and Preference-Driven Retrieval | Nov 1, 2024 | Hallucination, RAG | Unverified | 0 |
| RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models | Nov 1, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers | Oct 31, 2024 | Hallucination | Unverified | 0 |
| Exploring the Knowledge Mismatch Hypothesis: Hallucination Propensity in Small Models Fine-tuned on Data from Larger Models | Oct 31, 2024 | Hallucination, Misinformation | Unverified | 0 |
| EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | Oct 30, 2024 | Continual Learning, Hallucination | Unverified | 0 |
| Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot | Oct 30, 2024 | Chatbot, Dialogue State Tracking | Code Available | 0 |
| VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | Oct 30, 2024 | Benchmarking, Hallucination | Unverified | 0 |
| Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Oct 30, 2024 | Hallucination, Hallucination Evaluation | Code Available | 0 |
| Distinguishing Ignorance from Error in LLM Hallucinations | Oct 29, 2024 | Hallucination, Question Answering | Code Available | 1 |
| MARCO: Multi-Agent Real-time Chat Orchestration | Oct 29, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation | Oct 29, 2024 | Hallucination, Language Modeling | Unverified | 0 |