Hallucination

Papers

Showing 41–50 of 1816 papers

Title	Date	Tasks	Status	Hype
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation	Mar 8, 2024	Code Generation, Hallucination	Code Available	3
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making	Oct 9, 2024	Benchmarking, Decision Making	Code Available	3
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models	Oct 16, 2024	Diagnostic, Hallucination	Code Available	3
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models	Feb 12, 2024	Answer Generation, Hallucination	Code Available	3
Learning Dynamics of LLM Finetuning	Jul 15, 2024	Hallucination	Code Available	3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	Aug 28, 2024	Computational Efficiency, Hallucination	Code Available	3
CRAG -- Comprehensive RAG Benchmark	Jun 7, 2024	Hallucination, Language Modelling	Code Available	3
PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models	Feb 2, 2024	Action Generation, Decision Making	Code Available	3
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models	May 22, 2025	Benchmarking, Fairness	Code Available	3
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent	Nov 5, 2024	Benchmarking, Hallucination	Code Available	3

No leaderboard results yet.