SOTAVerified

Hallucination Papers

Showing 1701–1750 of 1816 papers

Title | Status | Hype
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | Code | 0
Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Code | 0
Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts | Code | 0
Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness | Code | 0
How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities | Code | 0
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training | Code | 0
How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild | Code | 0
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs | Code | 0
How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation? | Code | 0
A Claim Decomposition Benchmark for Long-form Answer Verification | Code | 0
Entity-driven Fact-aware Abstractive Summarization of Biomedical Literature | Code | 0
HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models | Code | 0
Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models | Code | 0
Projected Distribution Loss for Image Enhancement | Code | 0
Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Code | 0
What's Wrong? Refining Meeting Summaries with LLM Feedback | Code | 0
ToW: Thoughts of Words Improve Reasoning in Large Language Models | Code | 0
Prompt Injection Detection and Mitigation via AI Multi-Agent NLP Frameworks | Code | 0
HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding | Code | 0
Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning | Code | 0
Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning | Code | 0
ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs | Code | 0
A Unified Hallucination Mitigation Framework for Large Vision-Language Models | Code | 0
HaRiM+: Evaluating Summary Quality with Hallucination Risk | Code | 0
Assessing the Reliability of Large Language Model Knowledge | Code | 0
Are Large Language Models Good at Utility Judgments? | Code | 0
Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses | Code | 0
Pushing the Limits of Low-Resource Morphological Inflection | Code | 0
Embedding Hallucination for Few-Shot Language Fine-tuning | Code | 0
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation | Code | 0
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models | Code | 0
Verbosity ≠ Veracity: Demystify the Verbosity Compensation Behavior of Large Language Models | Code | 0
CiteBART: Learning to Generate Citations for Local Citation Recommendation | Code | 0
Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports | Code | 0
Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews | Code | 0
Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning | Code | 0
Anticipation-Free Training for Simultaneous Machine Translation | Code | 0
Qwen Look Again: Guiding Vision-Language Reasoning Models to Re-attention Visual Information | Code | 0
Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant | Code | 0
Handwritten Code Recognition for Pen-and-Paper CS Education | Code | 0
Handling Ontology Gaps in Semantic Parsing | Code | 0
Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning | Code | 0
Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization | Code | 0
Treble Counterfactual VLMs: A Causal Approach to Hallucination | Code | 0
Effectively Enhancing Vision Language Large Models by Prompt Augmentation and Caption Utilization | Code | 0
HaluEval-Wild: Evaluating Hallucinations of Language Models in the Wild | Code | 0
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0
ELOQ: Resources for Enhancing LLM Detection of Out-of-Scope Questions | Code | 0
Visually Dehallucinative Instruction Generation | Code | 0
Page 35 of 37
