Hallucination Papers

Showing 401–450 of 1816 papers

Title | Status | Hype
Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion | Code | 1
Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations | Code | 1
Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers | Code | 1
Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources | Code | 1
A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs | Code | 1
FineSurE: Fine-grained Summarization Evaluation using LLMs | Code | 1
Deficiency-Aware Masked Transformer for Video Inpainting | Code | 1
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Code | 1
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency | Code | 1
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Code | 1
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data | Code | 1
Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation | Code | 1
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue | Code | 1
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs | Code | 1
Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers | Code | 1
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models | Code | 1
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Code | 1
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback | Code | 1
Detecting and Preventing Hallucinations in Large Vision Language Models | Code | 1
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | Code | 1
FactAlign: Long-form Factuality Alignment of Large Language Models | Code | 1
FAIR GPT: A virtual consultant for research data management in ChatGPT | Code | 1
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding | Code | 1
Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation | Code | 1
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Code | 1
Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Code | 1
Extract Free Dense Misalignment from CLIP | Code | 1
AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation | Code | 1
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization | Code | 1
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1
Face Hallucination via Split-Attention in Split-Attention Network | Code | 1
Evaluation and Analysis of Hallucination in Large Vision-Language Models | Code | 1
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | Code | 1
DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models | Code | 1
Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Code | 1
Theory of Mind for Multi-Agent Collaboration via Large Language Models | Code | 1
EventHallusion: Diagnosing Event Hallucinations in Video LLMs | Code | 1
DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models | Code | 1
Doc2Query--: When Less is More | Code | 1
EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset | Code | 1
Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering | Code | 1
DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Code | 1
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Code | 1
Distinguishing Ignorance from Error in LLM Hallucinations | Code | 1
Federated Recommendation via Hybrid Retrieval Augmented Generation | Code | 1
Hallucinated Neural Radiance Fields in the Wild | Code | 1
Label Hallucination for Few-Shot Classification | Code | 1
PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine | Code | 1
Trustworthiness in Retrieval-Augmented Generation Systems: A Survey | Code | 1
Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation |  | 0
Page 9 of 37

No leaderboard results yet.