SOTAVerified

Hallucination Papers

Showing 1701–1725 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0 |
| A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | Code | 0 |
| Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Code | 0 |
| Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts | Code | 0 |
| Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness | Code | 0 |
| How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities | Code | 0 |
| Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training | Code | 0 |
| How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild | Code | 0 |
| Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs | Code | 0 |
| How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation? | Code | 0 |
| A Claim Decomposition Benchmark for Long-form Answer Verification | Code | 0 |
| Entity-driven Fact-aware Abstractive Summarization of Biomedical Literature | Code | 0 |
| HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models | Code | 0 |
| Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models | Code | 0 |
| Projected Distribution Loss for Image Enhancement | Code | 0 |
| Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Code | 0 |
| What's Wrong? Refining Meeting Summaries with LLM Feedback | Code | 0 |
| ToW: Thoughts of Words Improve Reasoning in Large Language Models | Code | 0 |
| Prompt Injection Detection and Mitigation via AI Multi-Agent NLP Frameworks | Code | 0 |
| HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding | Code | 0 |
| Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning | Code | 0 |
| Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning | Code | 0 |
| ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs | Code | 0 |
| A Unified Hallucination Mitigation Framework for Large Vision-Language Models | Code | 0 |
| HaRiM^+: Evaluating Summary Quality with Hallucination Risk | Code | 0 |
Page 69 of 73