Hallucination

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–50 of 1816 papers

Title	Date	Tasks	Status	Hype
MiniCPM-V: A GPT-4V Level MLLM on Your Phone	Aug 3, 2024	HallucinationMultiple-choice	CodeCode Available	12
Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models	Mar 5, 2025	HallucinationInstruction Following	CodeCode Available	11
SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning	Aug 10, 2024	HallucinationOptical Character Recognition	CodeCode Available	11
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness	May 27, 2024	HallucinationImage Captioning	CodeCode Available	11
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?	Nov 25, 2024	HallucinationKnowledge Distillation	CodeCode Available	7
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models	Jan 29, 2024	HallucinationMixture-of-Experts	CodeCode Available	7
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback	Dec 1, 2023	HallucinationImage Captioning	CodeCode Available	6
Gorilla: Large Language Model Connected with Massive APIs	May 24, 2023	HallucinationLanguage Modeling	CodeCode Available	6
UQLM: A Python Package for Uncertainty Quantification in Large Language Models	Jul 8, 2025	HallucinationUncertainty Quantification	CodeCode Available	5
DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning	May 20, 2025	HallucinationMathematical Reasoning	CodeCode Available	5
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers	Apr 27, 2025	HallucinationQuestion Answering	CodeCode Available	5
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean	Apr 18, 2024	Automated Theorem ProvingHallucination	CodeCode Available	5
Weakly Supervised Detection of Hallucinations in LLM Activations	Dec 5, 2023	HallucinationLanguage Modeling	CodeCode Available	5
Ferret: Refer and Ground Anything Anywhere at Any Granularity	Oct 11, 2023	HallucinationLanguage Modeling	CodeCode Available	5
Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model	Jun 28, 2023	HallucinationKnowledge Graphs	CodeCode Available	5
LettuceDetect: A Hallucination Detection Framework for RAG Applications	Feb 24, 2025	8kGPU	CodeCode Available	4
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding	Jan 14, 2025	Embodied Question AnsweringHallucination	CodeCode Available	4
A Survey of State of the Art Large Vision Language Models: Alignment, Benchmark, Evaluations and Challenges	Jan 4, 2025	FairnessHallucination	CodeCode Available	4
Halu-J: Critique-Based Hallucination Judge	Jul 17, 2024	Evidence SelectionHallucination	CodeCode Available	4
Hallucination of Multimodal Large Language Models: A Survey	Apr 29, 2024	HallucinationSurvey	CodeCode Available	4
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World	Feb 29, 2024	AllHallucination	CodeCode Available	4
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models	Feb 12, 2024	HallucinationObject Localization	CodeCode Available	4
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering	Feb 12, 2024	Common Sense ReasoningGraph Classification	CodeCode Available	4
LLM-Enhanced Data Management	Feb 4, 2024	HallucinationManagement	CodeCode Available	4
Retrieval-Augmented Generation for Large Language Models: A Survey	Dec 18, 2023	HallucinationRAG	CodeCode Available	4
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling	Nov 1, 2023	HallucinationKnowledge Distillation	CodeCode Available	4
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese	Sep 8, 2023	Domain AdaptationHallucination	CodeCode Available	4
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models	Jul 30, 2023	HallucinationPrompt Engineering	CodeCode Available	4
Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration	Jul 11, 2023	HallucinationLogic Grid Puzzle	CodeCode Available	4
Multimodal Chain-of-Thought Reasoning in Language Models	Feb 2, 2023	HallucinationLanguage Modelling	CodeCode Available	4
ReAct: Synergizing Reasoning and Acting in Language Models	Oct 6, 2022	Decision MakingFact Verification	CodeCode Available	4
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models	May 22, 2025	BenchmarkingFairness	CodeCode Available	3
Verdict: A Library for Scaling Judge-Time Compute	Feb 25, 2025	Fact CheckingHallucination	CodeCode Available	3
Automated Hypothesis Validation with Agentic Sequential Falsifications	Feb 14, 2025	Decision MakingHallucination	CodeCode Available	3
VideoRoPE: What Makes for Good Video Rotary Position Embedding?	Feb 7, 2025	HallucinationPosition	CodeCode Available	3
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion	Dec 5, 2024	Contrastive LearningHallucination	CodeCode Available	3
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent	Nov 5, 2024	BenchmarkingHallucination	CodeCode Available	3
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems	Nov 5, 2024	HallucinationRAG	CodeCode Available	3
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models	Oct 16, 2024	DiagnosticHallucination	CodeCode Available	3
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio	Oct 16, 2024	Hallucination	CodeCode Available	3
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models	Oct 16, 2024	HallucinationKnowledge Graphs	CodeCode Available	3
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making	Oct 9, 2024	BenchmarkingDecision Making	CodeCode Available	3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	Aug 28, 2024	Computational EfficiencyHallucination	CodeCode Available	3
Graph Retrieval-Augmented Generation: A Survey	Aug 15, 2024	HallucinationRAG	CodeCode Available	3
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework	Aug 2, 2024	BenchmarkingDataset Generation	CodeCode Available	3
Learning Dynamics of LLM Finetuning	Jul 15, 2024	Hallucination	CodeCode Available	3
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models	Jun 16, 2024	HallucinationHallucination Evaluation	CodeCode Available	3
CRAG -- Comprehensive RAG Benchmark	Jun 7, 2024	HallucinationLanguage Modelling	CodeCode Available	3
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models	May 23, 2024	HallucinationSentence	CodeCode Available	3
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing	Apr 30, 2024	Computational EfficiencyHallucination	CodeCode Available	3

Show:10 25 50

← PrevPage 1 of 37Next →

No leaderboard results yet.