Hallucination

Papers


Showing 901–950 of 1816 papers

Title	Date	Tasks	Status	Hype
VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models	Jun 24, 2024	Hallucination, Video Understanding	Unverified	0
Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models	Jun 24, 2024	Common Sense Reasoning, Hallucination	Code Available	1
Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models	Jun 24, 2024	Hallucination, Image Generation	Code Available	0
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs	Jun 22, 2024	Hallucination, Uncertainty Quantification	Code Available	2
Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework	Jun 20, 2024	Hallucination, Question Answering	Code Available	2
Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?	Jun 20, 2024	Caption Generation, Hallucination	Unverified	0
From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment	Jun 20, 2024	Descriptive, Hallucination	Unverified	0
HIGHT: Hierarchical Graph Tokenization for Molecule-Language Alignment	Jun 20, 2024	Graph Neural Network, Hallucination	Unverified	0
Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination	Jun 20, 2024	Hallucination	Unverified	0
Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases	Jun 19, 2024	8k, Hallucination	Code Available	2
Knowledge Graph-Enhanced Large Language Models via Path Selection	Jun 19, 2024	Hallucination, Knowledge Graphs	Code Available	1
StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation	Jun 19, 2024	Hallucination, Retrieval	Code Available	0
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors	Jun 18, 2024	Hallucination, Language Modeling	Code Available	0
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding	Jun 18, 2024	Hallucination	Code Available	1
RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation	Jun 18, 2024	Hallucination, RAG	Unverified	0
What Matters in Memorizing and Recalling Facts? Multifaceted Benchmarks for Knowledge Probing in Language Models	Jun 18, 2024	Decoder, Hallucination	Unverified	0
On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation	Jun 18, 2024	Hallucination, Response Generation	Code Available	0
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?	Jun 18, 2024	Attribute, Hallucination	Unverified	0
Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models	Jun 18, 2024	Hallucination	Unverified	0
InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States	Jun 17, 2024	Benchmarking, Contrastive Learning	Unverified	0
Self-training Large Language Models through Knowledge Detection	Jun 17, 2024	Hallucination, Language Modeling	Code Available	0
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector	Jun 17, 2024	2k, Hallucination	Code Available	1
Mitigating Large Language Model Hallucination with Faithful Finetuning	Jun 17, 2024	Hallucination, Language Modeling	Unverified	0
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs	Jun 17, 2024	counterfactual, Hallucination	Code Available	0
Hallucination Mitigation Prompts Long-term Video Understanding	Jun 17, 2024	Answer Generation, Hallucination	Code Available	0
CoMT: Chain-of-Medical-Thought Reduces Hallucination in Medical Report Generation	Jun 17, 2024	Diagnostic, Hallucination	Unverified	0
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts	Jun 17, 2024	Hallucination, Mixture-of-Experts	Code Available	1
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models	Jun 17, 2024	Benchmarking	Code Available	2
mDPO: Conditional Preference Optimization for Multimodal Large Language Models	Jun 17, 2024	Hallucination, Language Modeling	Code Available	2
Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals	Jun 16, 2024	Hallucination	Unverified	0
Post-hoc Utterance Refining Method by Entity Mining for Faithful Knowledge Grounded Conversations	Jun 16, 2024	Hallucination, Misinformation	Code Available	0
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models	Jun 16, 2024	Hallucination, Hallucination Evaluation	Code Available	3
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models	Jun 14, 2024	Hallucination, Medical Visual Question Answering	Unverified	0
MMRel: A Relation Understanding Benchmark in the MLLM Era	Jun 13, 2024	Diversity, Hallucination	Code Available	1
Understanding Hallucinations in Diffusion Models through Mode Interpolation	Jun 13, 2024	Hallucination, Image Generation	Code Available	2
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation	Jun 13, 2024	Benchmarking, Hallucination	Code Available	0
We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs	Jun 12, 2024	Code Generation, Hallucination	Code Available	1
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models	Jun 12, 2024	Audio captioning, Hallucination	Code Available	2
Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis	Jun 11, 2024	Hallucination, Language Modelling	Unverified	0
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy	Jun 11, 2024	Diversity, Hallucination	Code Available	1
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions	Jun 11, 2024	Hallucination, Image Description	Code Available	2
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation	Jun 11, 2024	Hallucination	Code Available	0
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation	Jun 11, 2024	Hallucination, Hallucination Evaluation	Code Available	0
On the Hallucination in Simultaneous Machine Translation	Jun 11, 2024	Hallucination, Machine Translation	Code Available	0
Progressive Query Expansion for Retrieval Over Cost-constrained Data Sources	Jun 11, 2024	Hallucination, Retrieval	Unverified	0
Estimating the Hallucination Rate of Generative AI	Jun 11, 2024	Hallucination, In-Context Learning	Unverified	0
DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation	Jun 9, 2024	Common Sense Reasoning, Denoising	Code Available	1
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation	Jun 8, 2024	Abstractive Text Summarization, Dialogue Generation	Unverified	0
CRAG -- Comprehensive RAG Benchmark	Jun 7, 2024	Hallucination, Language Modelling	Code Available	3
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models	Jun 7, 2024	Hallucination, parameter-efficient fine-tuning	Code Available	1

