| Title | Date | Tags | Code | Citations |
| --- | --- | --- | --- | --- |
| Valuable Hallucinations: Realizable Non-realistic Propositions | Feb 16, 2025 | Hallucination | Unverified | 0 |
| A Survey of LLM-based Agents in Medicine: How far are we from Baymax? | Feb 16, 2025 | Hallucination, Survey | Unverified | 0 |
| Automated Hypothesis Validation with Agentic Sequential Falsifications | Feb 14, 2025 | Decision Making, Hallucination | Code Available | 3 |
| Enhancing RAG with Active Learning on Conversation Records: Reject Incapables and Answer Capables | Feb 13, 2025 | Active Learning, Hallucination | Unverified | 0 |
| DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities | Feb 11, 2025 | Hallucination, SSIM | Unverified | 0 |
| Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning | Feb 11, 2025 | Hallucination, In-Context Learning | Code Available | 0 |
| Hallucination, Monofacts, and Miscalibration: An Empirical Investigation | Feb 11, 2025 | Decoder, Hallucination | Code Available | 0 |
| Refine Knowledge of Large Language Models via Adaptive Contrastive Learning | Feb 11, 2025 | Contrastive Learning, Hallucination | Unverified | 0 |
| Hallucination Detection: A Probabilistic Framework Using Embeddings Distance Analysis | Feb 10, 2025 | Hallucination | Unverified | 0 |
| Knowledge Graph-Guided Retrieval Augmented Generation | Feb 8, 2025 | Diversity, Hallucination | Code Available | 2 |
| Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models | Feb 8, 2025 | Conformal Prediction, Decision Making | Code Available | 0 |
| Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks | Feb 7, 2025 | Abstractive Text Summarization, Explanation Generation | Code Available | 0 |
| VideoRoPE: What Makes for Good Video Rotary Position Embedding? | Feb 7, 2025 | Hallucination, Position | Code Available | 3 |
| ChallengeMe: An Adversarial Learning-enabled Text Summarization Framework | Feb 7, 2025 | Hallucination, Specificity | Unverified | 0 |
| Linear Correlation in LM's Compositional Generalization and Hallucination | Feb 6, 2025 | Hallucination | Code Available | 0 |
| TruthFlow: Truthful LLM Generation via Representation Flow Correction | Feb 6, 2025 | Hallucination, TruthfulQA | Unverified | 0 |
| Large Language Models for Multi-Robot Systems: A Survey | Feb 6, 2025 | Action Generation, Benchmarking | Code Available | 1 |
| Enhancing Hallucination Detection through Noise Injection | Feb 6, 2025 | Hallucination | Unverified | 0 |
| The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering | Feb 5, 2025 | Hallucination | Code Available | 2 |
| A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) | Feb 5, 2025 | Hallucination, Spatial Reasoning | Unverified | 0 |
| DAMO: Data- and Model-aware Alignment of Multi-modal LLMs | Feb 4, 2025 | Hallucination | Code Available | 1 |
| Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration | Feb 4, 2025 | Attribute, Hallucination | Unverified | 0 |
| Eliciting Language Model Behaviors with Investigator Agents | Feb 3, 2025 | Bayesian Inference, Hallucination | Unverified | 0 |
| SelfCheckAgent: Zero-Resource Hallucination Detection in Generative Large Language Models | Feb 3, 2025 | Hallucination | Unverified | 0 |
| MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation | Feb 3, 2025 | Benchmarking, Fairness | Unverified | 0 |