| Title | Date | Tags | Code | Stars |
| --- | --- | --- | --- | --- |
| Navigating Hallucinations for Reasoning of Unintentional Activities | Feb 29, 2024 | Hallucination, Navigate | Unverified | 0 |
| Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models | Feb 28, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| Collaborative Decoding of Critical Tokens for Boosting Factuality of Large Language Models | Feb 28, 2024 | Hallucination, Instruction Following | Unverified | 0 |
| Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore | Feb 28, 2024 | Diversity, Form | Code Available | 0 |
| Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models | Feb 27, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses | Feb 27, 2024 | Hallucination | Code Available | 0 |
| GROUNDHOG: Grounding Large Language Models to Holistic Segmentation | Feb 26, 2024 | Causal Language Modeling, Generalized Referring Expression Segmentation | Unverified | 0 |
| Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models | Feb 26, 2024 | Decision Making, Hallucination | Unverified | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | Feb 25, 2024 | Face Generation, Hallucination | Unverified | 0 |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Feb 25, 2024 | Benchmarking, Chatbot | Code Available | 0 |
| Rethinking Software Engineering in the Foundation Model Era: A Curated Catalogue of Challenges in the Development of Trustworthy FMware | Feb 25, 2024 | Hallucination | Unverified | 0 |
| Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models | Feb 24, 2024 | Hallucination, Hallucination Evaluation | Unverified | 0 |
| CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean | Feb 23, 2024 | Classification, Hallucination | Unverified | 0 |
| UFO: A Unified and Flexible Framework for Evaluating Factuality of Large Language Models | Feb 22, 2024 | Hallucination, Retrieval | Code Available | 0 |
| Does the Generator Mind Its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer | Feb 22, 2024 | Generative Question Answering, Hallucination | Unverified | 0 |
| DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models | Feb 22, 2024 | Hallucination | Code Available | 0 |
| Science Checker Reloaded: A Bidirectional Paradigm for Transparency and Logical Reasoning | Feb 21, 2024 | Hallucination, Information Retrieval | Code Available | 0 |
| OPDAI at SemEval-2024 Task 6: Small LLMs can Accelerate Hallucination Detection with Weakly Supervised Data | Feb 20, 2024 | Few-Shot Learning, Hallucination | Unverified | 0 |
| Enhanced Hallucination Detection in Neural Machine Translation through Simple Detector Aggregation | Feb 20, 2024 | Hallucination, Machine Translation | Unverified | 0 |
| Emergence and Dynamics of Delusions and Hallucinations across Stages in Early Psychosis | Feb 20, 2024 | Hallucination | Unverified | 0 |
| GOOD: Towards Domain Generalized Orientated Object Detection | Feb 20, 2024 | Hallucination, Object | Unverified | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Feb 20, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Structured Chain-of-Thought Prompting for Few-Shot Generation of Content-Grounded QA Conversations | Feb 19, 2024 | Hallucination, Language Modeling | Unverified | 0 |
| Enabling Weak LLMs to Judge Response Reliability via Meta Ranking | Feb 19, 2024 | Hallucination, In-Context Learning | Unverified | 0 |
| M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation | Feb 19, 2024 | counterfactual, Dialogue Generation | Unverified | 0 |