SOTAVerified

Hallucination Papers

Showing 151–175 of 1816 papers

Title | Status | Hype
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models | Code | 2
Enabling Large Language Models to Generate Text with Citations | Code | 2
Lawyer LLaMA Technical Report | Code | 2
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | Code | 2
HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation | Code | 2
Evaluating Object Hallucination in Large Vision-Language Models | Code | 2
Exploring Human-Like Translation Strategy with Large Language Models | Code | 2
GPT-NER: Named Entity Recognition via Large Language Models | Code | 2
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | Code | 2
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization | Code | 2
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions | Code | 2
Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression | Code | 2
PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision | Code | 2
Mitigating Object Hallucinations via Sentence-Level Early Intervention | Code | 1
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality | Code | 1
DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models | Code | 1
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs | Code | 1
Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs | Code | 1
MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models | Code | 1
Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models | Code | 1
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification | Code | 1
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Code | 1
FlySearch: Exploring how vision-language models explore | Code | 1
The Hallucination Dilemma: Factuality-Aware Reinforcement Learning for Large Reasoning Models | Code | 1
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models | Code | 1
Page 7 of 73

No leaderboard results yet.