SOTAVerified

Hallucination

Papers

Showing 101-150 of 1816 papers

| Title | Status | Hype |
| --- | --- | --- |
| Chain-of-Thought Poisoning Attacks against R1-based Retrieval-Augmented Generation Systems | | 0 |
| Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression | Code | 1 |
| AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models | Code | 3 |
| Walk&Retrieve: Simple Yet Effective Zero-shot Retrieval-Augmented Generation via Knowledge Graph Walks | Code | 0 |
| NEXT-EVAL: Next Evaluation of Traditional and LLM Web Data Record Extraction | | 0 |
| Aug2Search: Enhancing Facebook Marketplace Search with LLM-Generated Synthetic Data Augmentation | | 0 |
| OViP: Online Vision-Language Preference Learning | | 0 |
| Hallucinate at the Last in Long Response Generation: A Case Study on Long Document Summarization | | 0 |
| Multilingual Prompting for Improving LLM Generation Diversity | | 0 |
| KaFT: Knowledge-aware Fine-tuning for Boosting LLMs' Domain-specific Question-Answering Performance | | 0 |
| RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection | | 0 |
| HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving | | 0 |
| Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization | | 0 |
| Foundations of Unknown-aware Machine Learning | | 0 |
| Multimodal RAG-driven Anomaly Detection and Classification in Laser Powder Bed Fusion using Large Language Models | | 0 |
| Visual Instruction Bottleneck Tuning | | 0 |
| JARVIS: A Multi-Agent Code Assistant for High-Quality EDA Script Generation | | 0 |
| Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey | | 0 |
| Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis | | 0 |
| The Hallucination Tax of Reinforcement Finetuning | | 0 |
| Towards Omnidirectional Reasoning with 360-R1: A Dataset, Benchmark, and GRPO-based Method | | 0 |
| Aligning Attention Distribution to Information Flow for Hallucination Mitigation in Large Vision-Language Models | | 0 |
| Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | Code | 0 |
| DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning | Code | 5 |
| Legal Rule Induction: Towards Generalizable Principle Discovery from Analogous Judicial Precedents | | 0 |
| MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations | Code | 0 |
| Know Or Not: a library for evaluating out-of-knowledge base robustness | Code | 1 |
| Selective Code Generation for Functional Guarantees | | 0 |
| Granary: Speech Recognition and Translation Dataset in 25 European Languages | | 0 |
| LLM-based Query Expansion Fails for Unfamiliar and Ambiguous Queries | Code | 0 |
| Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down | | 0 |
| Mitigating Hallucination in VideoLLMs via Temporal-Aware Activation Engineering | | 0 |
| Detection and Mitigation of Hallucination in Large Reasoning Models: A Mechanistic Perspective | | 0 |
| Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice | | 0 |
| Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation | | 0 |
| Mitigating Hallucinations via Inter-Layer Consistency Aggregation in Large Vision-Language Models | | 0 |
| The Tower of Babel Revisited: Multilingual Jailbreak Prompts on Closed-Source Large Language Models | | 0 |
| Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models | Code | 0 |
| CCNU at SemEval-2025 Task 3: Leveraging Internal and External Knowledge of Large Language Models for Multilingual Hallucination Annotation | | 0 |
| Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning? | | 0 |
| Towards Robust Evaluation of STEM Education: Leveraging MLLMs in Project-Based Learning | | 0 |
| Diverging Towards Hallucination: Detection of Failures in Vision-Language Models via Multi-token Aggregation | | 0 |
| EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models | Code | 0 |
| Phare: A Safety Probe for Large Language Models | Code | 1 |
| Finetune-RAG: Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation | Code | 1 |
| DO-RAG: A Domain-Specific QA Framework Using Knowledge Graph-Enhanced Retrieval-Augmented Generation | Code | 0 |
| AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges | | 0 |
| Beyond the Black Box: Interpretability of LLMs in Finance | | 0 |
| The Impact of Large Language Models on Task Automation in Manufacturing Services | | 0 |
| A Multimodal Multi-Agent Framework for Radiology Report Generation | | 0 |
Page 3 of 37

No leaderboard results yet.