Hallucination

Papers

Showing 51–75 of 1816 papers

Title	Date	Tasks	Status	Hype	Score
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation	Mar 8, 2024	Code Generation, Hallucination	Code Available	3	5
PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models	Feb 2, 2024	Action Generation, Decision Making	Code Available	3	5
Verdict: A Library for Scaling Judge-Time Compute	Feb 25, 2025	Fact Checking, Hallucination	Code Available	3	5
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent	Nov 5, 2024	Benchmarking, Hallucination	Code Available	3	5
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework	Aug 2, 2024	Benchmarking, Dataset Generation	Code Available	3	5
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models	May 22, 2025	Benchmarking, Fairness	Code Available	3	5
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems	Nov 5, 2024	Hallucination, RAG	Code Available	3	5
When Large Language Models Meet Vector Databases: A Survey	Jan 30, 2024	Hallucination, Information Retrieval	Code Available	3	5
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making	Oct 9, 2024	Benchmarking, Decision Making	Code Available	3	5
Automated Hypothesis Validation with Agentic Sequential Falsifications	Feb 14, 2025	Decision Making, Hallucination	Code Available	3	5
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models	Oct 16, 2024	Diagnostic, Hallucination	Code Available	3	5
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	Aug 28, 2024	Computational Efficiency, Hallucination	Code Available	3	5
CRAG -- Comprehensive RAG Benchmark	Jun 7, 2024	Hallucination, Language Modelling	Code Available	3	5
Learning Dynamics of LLM Finetuning	Jul 15, 2024	Hallucination	Code Available	3	5
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models	Mar 19, 2024	Hallucination	Code Available	3	5
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models	May 23, 2024	Hallucination, Sentence	Code Available	3	5
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment	Feb 13, 2024	Hallucination	Code Available	2	5
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation	Mar 3, 2024	Hallucination, TruthfulQA	Code Available	2	5
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions	Dec 20, 2022	Hallucination, Question Answering	Code Available	2	5
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models	Aug 4, 2024	Hallucination	Code Available	2	5
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs	Jan 28, 2025	Hallucination	Code Available	2	5
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models	May 19, 2023	Hallucination, Hallucination Evaluation	Code Available	2	5
A Diffusion-Based Generative Equalizer for Music Restoration	Mar 27, 2024	Bandwidth Extension, Hallucination	Code Available	2	5
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies	Aug 6, 2023	Hallucination	Code Available	2	5
Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions	Jun 11, 2024	Hallucination, Image Description	Code Available	2	5
