SOTAVerified

Hallucination

Papers

Showing 1351-1400 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration | | 0 |
| AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models | Code | 0 |
| Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics | | 0 |
| Investigating the performance of Retrieval-Augmented Generation and fine-tuning for the development of AI-driven knowledge-based systems | Code | 0 |
| TRAWL: External Knowledge-Enhanced Recommendation with LLM Assistance | | 0 |
| Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos | | 0 |
| Guiding Clinical Reasoning with Large Language Models via Knowledge Seeds | | 0 |
| On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization | Code | 0 |
| Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach | | 0 |
| ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models | Code | 0 |
| Can Large Language Models Play Games? A Case Study of A Self-Play Approach | | 0 |
| Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | | 0 |
| ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues | | 0 |
| Effectiveness Assessment of Recent Large Vision-Language Models | | 0 |
| Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | | 0 |
| HaluEval-Wild: Evaluating Hallucinations of Language Models in the Wild | Code | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Code | 0 |
| German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset | Code | 0 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | | 0 |
| Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models | | 0 |
| Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering | | 0 |
| Self-Consistent Decoding for More Factual Open Responses | Code | 0 |
| MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection | | 0 |
| Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models | | 0 |
| Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | | 0 |
| Navigating Hallucinations for Reasoning of Unintentional Activities | | 0 |
| Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models | Code | 0 |
| Collaborative decoding of critical tokens for boosting factuality of large language models | | 0 |
| Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore | Code | 0 |
| Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models | | 0 |
| Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses | Code | 0 |
| GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | | 0 |
| Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models | | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | | 0 |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Code | 0 |
| Rethinking Software Engineering in the Foundation Model Era: A Curated Catalogue of Challenges in the Development of Trustworthy FMware | | 0 |
| Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | | 0 |
| CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean | | 0 |
| UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models | Code | 0 |
| Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer | | 0 |
| DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models | Code | 0 |
| Science Checker Reloaded: A Bidirectional Paradigm for Transparency and Logical Reasoning | Code | 0 |
| OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination Detection with Weakly Supervised Data | | 0 |
| Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation | | 0 |
| Emergence and dynamics of delusions and hallucinations across stages in early psychosis | | 0 |
| GOOD: Towards Domain Generalized Orientated Object Detection | | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | | 0 |
| Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations | | 0 |
| Enabling Weak LLMs to Judge Response Reliability via Meta Ranking | | 0 |
| M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation | | 0 |
Page 28 of 37

No leaderboard results yet.