SOTAVerified

Hallucination Papers

Showing 576–600 of 1816 papers

Title | Status | Hype
H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models | | 0
Fine-Tuning Vision-Language Model for Automated Engineering Drawing Information Extraction | | 0
Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation | | 0
Automated, LLM enabled extraction of synthesis details for reticular materials from scientific literature | | 0
VERITAS: A Unified Approach to Reliability Evaluation | | 0
Leveraging Vision-Language Models for Manufacturing Feature Recognition in CAD Designs | | 0
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization | Code | 2
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent | Code | 3
DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark | Code | 0
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems | Code | 3
CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | | 0
Robust plug-and-play methods for highly accelerated non-Cartesian MRI reconstruction | | 0
Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models | | 0
Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models | Code | 0
Towards Multi-Source Retrieval-Augmented Generation via Synergizing Reasoning and Preference-Driven Retrieval | | 0
RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models | | 0
Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers | | 0
Exploring the Knowledge Mismatch Hypothesis: Hallucination Propensity in Small Models Fine-tuned on Data from Larger Models | | 0
EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | | 0
Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot | Code | 0
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | | 0
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Code | 0
Distinguishing Ignorance from Error in LLM Hallucinations | Code | 1
MARCO: Multi-Agent Real-time Chat Orchestration | | 0
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation | | 0
Page 24 of 73
