| The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG) | May 21, 2024 | Hallucination, RAG | Code Available | 1 |
| Retrieval-Augmented Language Model for Extreme Multi-Label Knowledge Graph Link Prediction | May 21, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models | May 20, 2024 | Benchmarking, Diversity | Unverified | 0 |
| Automated Multi-level Preference for MLLMs | May 18, 2024 | Dataset Generation, Hallucination | Code Available | 1 |
| Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model | May 16, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling | May 16, 2024 | Contrastive Learning, Hallucination | Code Available | 1 |
| Spurious reconstruction from brain activity | May 16, 2024 | Brain Decoding, Hallucination | Code Available | 0 |
| A Comprehensive Survey of Hallucination in Large Language, Image, Video and Audio Foundation Models | May 15, 2024 | Hallucination | Unverified | 0 |
| Word Alignment as Preference for Machine Translation | May 15, 2024 | Hallucination, Language Modelling | Unverified | 0 |
| Navigating LLM Ethics: Advancements, Challenges, and Future Directions | May 14, 2024 | Ethics, Fairness | Unverified | 0 |
| ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation | May 14, 2024 | Hallucination, scientific discovery | Unverified | 0 |
| Control Token with Dense Passage Retrieval | May 13, 2024 | Hallucination, Passage Retrieval | Unverified | 0 |
| Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness | May 13, 2024 | Benchmarking, counterfactual | Unverified | 0 |
| Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval | May 10, 2024 | Hallucination, Knowledge Graphs | Unverified | 0 |
| LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought | May 9, 2024 | Hallucination, Math | Unverified | 0 |
| THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | May 8, 2024 | Attribute, Data Augmentation | Code Available | 1 |
| Is the House Ready For Sleeptime? Generating and Evaluating Situational Queries for Embodied Question Answering | May 8, 2024 | 2k, Embodied Question Answering | Unverified | 0 |
| SUTRA: Scalable Multilingual Language Model Architecture | May 7, 2024 | Computational Efficiency, Hallucination | Unverified | 0 |
| Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models | May 7, 2024 | Hallucination, Knowledge Graphs | Code Available | 0 |
| Deception in Reinforced Autonomous Agents | May 7, 2024 | Deception Detection, Hallucination | Unverified | 0 |
| Quantifying the Capabilities of LLMs across Scale and Precision | May 6, 2024 | Hallucination, Misinformation | Unverified | 0 |
| Score-based Generative Priors Guided Model-driven Network for MRI Reconstruction | May 5, 2024 | Denoising, Hallucination | Unverified | 0 |
| R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models | May 4, 2024 | Graph Attention, Hallucination | Unverified | 0 |
| Attribution in Scientific Literature: New Benchmark and Methods | May 3, 2024 | Author Attribution, Hallucination | Unverified | 0 |
| FLAME: Factuality-Aware Alignment for Large Language Models | May 2, 2024 | Hallucination, Instruction Following | Unverified | 0 |