SOTAVerified

Hallucination Papers

Showing 1351–1375 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration | | 0 |
| AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models | Code | 0 |
| Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics | | 0 |
| Investigating the performance of Retrieval-Augmented Generation and fine-tuning for the development of AI-driven knowledge-based systems | Code | 0 |
| TRAWL: External Knowledge-Enhanced Recommendation with LLM Assistance | | 0 |
| Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos | | 0 |
| Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds | | 0 |
| On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization | Code | 0 |
| Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach | | 0 |
| ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models | Code | 0 |
| Can Large Language Models Play Games? A Case Study of A Self-Play Approach | | 0 |
| Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | | 0 |
| ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues | | 0 |
| Effectiveness Assessment of Recent Large Vision-Language Models | | 0 |
| Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | | 0 |
| HaluEval-Wild: Evaluating Hallucinations of Language Models in the Wild | Code | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Code | 0 |
| German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset | Code | 0 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | | 0 |
| Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models | | 0 |
| Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering | | 0 |
| Self-Consistent Decoding for More Factual Open Responses | Code | 0 |
| MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection | | 0 |
| Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models | | 0 |
| Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | | 0 |
Page 55 of 73
