SOTAVerified

Hallucination

Papers

Showing 901–950 of 1816 papers

Title | Status | Hype
VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation | Code | 0
Enabling Explainable Recommendation in E-commerce with LLM-powered Product Knowledge Graph | - | 0
Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering | Code | 0
INVARLLM: LLM-assisted Physical Invariant Extraction for Cyber-Physical Systems Anomaly Detection | - | 0
Chain-of-Programming (CoP) : Empowering Large Language Models for Geospatial Code Generation | - | 0
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models | - | 0
A Novel Approach to Eliminating Hallucinations in Large Language Model-Assisted Causal Discovery | - | 0
Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs | - | 0
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization | - | 0
Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity | - | 0
LLM Hallucination Reasoning with Zero-shot Knowledge Test | - | 0
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0
On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse | - | 0
Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Code | 0
Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness | Code | 0
Verbosity ≠ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models | Code | 0
Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders | - | 0
DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises | Code | 0
SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents | - | 0
Evaluating the Accuracy of Chatbots in Financial Literature | - | 0
Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation | - | 0
Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques | - | 0
Mitigating Hallucination with ZeroG: An Advanced Knowledge Management Engine | - | 0
Seeing Through the Fog: A Cost-Effectiveness Analysis of Hallucination Detection Systems | - | 0
LLM-R: A Framework for Domain-Adaptive Maintenance Scheme Generation Combining Hierarchical Agents and RAG | - | 0
AMSnet-KG: A Netlist Dataset for LLM-based AMS Circuit Auto-Design Using Knowledge Graph RAG | - | 0
Prompt-Guided Internal States for Hallucination Detection of Large Language Models | - | 0
H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models | - | 0
Fine-Tuning Vision-Language Model for Automated Engineering Drawing Information Extraction | - | 0
Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation | - | 0
DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark | Code | 0
VERITAS: A Unified Approach to Reliability Evaluation | - | 0
Automated, LLM enabled extraction of synthesis details for reticular materials from scientific literature | - | 0
Leveraging Vision-Language Models for Manufacturing Feature Recognition in CAD Designs | - | 0
Robust plug-and-play methods for highly accelerated non-Cartesian MRI reconstruction | - | 0
CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality | - | 0
Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models | - | 0
Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models | Code | 0
Towards Multi-Source Retrieval-Augmented Generation via Synergizing Reasoning and Preference-Driven Retrieval | - | 0
RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models | - | 0
Exploring the Knowledge Mismatch Hypothesis: Hallucination Propensity in Small Models Fine-tuned on Data from Larger Models | - | 0
Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers | - | 0
VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning | - | 0
EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | - | 0
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models | Code | 0
Beyond Ontology in Dialogue State Tracking for Goal-Oriented Chatbot | Code | 0
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation | - | 0
MARCO: Multi-Agent Real-time Chat Orchestration | - | 0
A Perspective for Adapting Generalist AI to Specialized Medical AI Applications and Their Challenges | - | 0
A Debate-Driven Experiment on LLM Hallucinations and Accuracy | - | 0
