| Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | Mar 7, 2024 | Fact Checking, Hallucination | Unverified | 0 |
| Effectiveness Assessment of Recent Large Vision-Language Models | Mar 7, 2024 | Anomaly Detection, Attribute | Unverified | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Mar 6, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset | Mar 6, 2024 | Hallucination, In-Context Learning | Code Available | 0 |
| KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Mar 5, 2024 | Hallucination, Self-Learning | Code Available | 3 |
| InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers | Mar 5, 2024 | Hallucination | Code Available | 1 |
| The Claude 3 Model Family: Opus, Sonnet, Haiku | Mar 4, 2024 | 1 Image, 2*2 Stitching, Arithmetic Reasoning | Unverified | 0 |
| Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering | Mar 3, 2024 | Claim Verification, Graph Question Answering | Unverified | 0 |
| Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models | Mar 3, 2024 | Hallucination | Unverified | 0 |
| CR-LT-KGQA: A Knowledge Graph Question Answering Dataset Requiring Commonsense Reasoning and Long-Tail Knowledge | Mar 3, 2024 | Claim Verification, Graph Question Answering | Code Available | 1 |
| In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation | Mar 3, 2024 | Hallucination, TruthfulQA | Code Available | 2 |
| MALTO at SemEval-2024 Task 6: Leveraging Synthetic Data for LLM Hallucination Detection | Mar 1, 2024 | Data Augmentation, Hallucination | Unverified | 0 |
| DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models | Mar 1, 2024 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding | Mar 1, 2024 | Hallucination, Object | Code Available | 2 |
| Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models | Mar 1, 2024 | Hallucination, Retrieval | Unverified | 0 |
| Self-Consistent Decoding for More Factual Open Responses | Mar 1, 2024 | Hallucination, Response Generation | Code Available | 0 |
| Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | Feb 29, 2024 | Hallucination | Unverified | 0 |
| The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Feb 29, 2024 | All, Hallucination | Code Available | 4 |
| Navigating Hallucinations for Reasoning of Unintentional Activities | Feb 29, 2024 | Hallucination, Navigate | Unverified | 0 |
| Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore | Feb 28, 2024 | Diversity, Form | Code Available | 0 |
| Collaborative decoding of critical tokens for boosting factuality of large language models | Feb 28, 2024 | Hallucination, Instruction Following | Unverified | 0 |
| All in an Aggregated Image for In-Image Learning | Feb 28, 2024 | All, Hallucination | Code Available | 1 |
| Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models | Feb 28, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models | Feb 27, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space | Feb 27, 2024 | Contrastive Learning, Hallucination | Code Available | 2 |
| Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses | Feb 27, 2024 | Hallucination | Code Available | 0 |
| Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models | Feb 26, 2024 | Decision Making, Hallucination | Unverified | 0 |
| GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | Feb 26, 2024 | Causal Language Modeling, Generalized Referring Expression Segmentation | Unverified | 0 |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Feb 25, 2024 | Benchmarking, Chatbot | Code Available | 0 |
| Rethinking Software Engineering in the Foundation Model Era: A Curated Catalogue of Challenges in the Development of Trustworthy FMware | Feb 25, 2024 | Hallucination | Unverified | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | Feb 25, 2024 | Face Generation, Hallucination | Unverified | 0 |
| Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy | Feb 25, 2024 | Hallucination, Sentence | Code Available | 1 |
| Citation-Enhanced Generation for LLM-based Chatbots | Feb 25, 2024 | Chatbot, Citation Prediction | Code Available | 1 |
| Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | Feb 24, 2024 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models | Feb 23, 2024 | Hallucination | Code Available | 1 |
| CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean | Feb 23, 2024 | Classification, Hallucination | Unverified | 0 |
| Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding | Feb 23, 2024 | Hallucination, Object | Code Available | 1 |
| UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models | Feb 22, 2024 | Hallucination, Retrieval | Code Available | 0 |
| DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models | Feb 22, 2024 | Hallucination | Code Available | 0 |
| Visual Hallucinations of Multi-modal Large Language Models | Feb 22, 2024 | Diversity, Hallucination | Code Available | 1 |
| Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective | Feb 22, 2024 | Hallucination, Sentence | Code Available | 2 |
| Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer | Feb 22, 2024 | Generative Question Answering, Hallucination | Unverified | 0 |
| Science Checker Reloaded: A Bidirectional Paradigm for Transparency and Logical Reasoning | Feb 21, 2024 | Hallucination, Information Retrieval | Code Available | 0 |
| Emergence and dynamics of delusions and hallucinations across stages in early psychosis | Feb 20, 2024 | Hallucination | Unverified | 0 |
| Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation | Feb 20, 2024 | Hallucination, Machine Translation | Unverified | 0 |
| OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination Detection with Weakly Supervised Data | Feb 20, 2024 | Few-Shot Learning, Hallucination | Unverified | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Feb 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| GOOD: Towards Domain Generalized Orientated Object Detection | Feb 20, 2024 | Hallucination, Object | Unverified | 0 |
| TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization | Feb 20, 2024 | Hallucination, News Summarization | Code Available | 1 |
| Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations | Feb 19, 2024 | Hallucination, Language Modeling | Unverified | 0 |