SOTAVerified

Hallucination

Papers

Showing 251-275 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| Automated Review Generation Method Based on Large Language Models | Code | 1 |
| Enhancing LLM's Cognition via Structurization | Code | 1 |
| HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning | Code | 1 |
| Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks | Code | 1 |
| Multi-Object Hallucination in Vision-Language Models | Code | 1 |
| MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? | Code | 1 |
| MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context | Code | 1 |
| FineSurE: Fine-grained Summarization Evaluation using LLMs | Code | 1 |
| Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Code | 1 |
| GraphArena: Benchmarking Large Language Models on Graph Computational Problems | Code | 1 |
| ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models | Code | 1 |
| Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models | Code | 1 |
| Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models | Code | 1 |
| Knowledge Graph-Enhanced Large Language Models via Path Selection | Code | 1 |
| Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding | Code | 1 |
| Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector | Code | 1 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Code | 1 |
| MMRel: A Relation Understanding Benchmark in the MLLM Era | Code | 1 |
| We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs | Code | 1 |
| REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy | Code | 1 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Code | 1 |
| An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Code | 1 |
| Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training | Code | 1 |
| TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models | Code | 1 |
| Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization | Code | 1 |

No leaderboard results yet.