| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs | Feb 25, 2024 | Benchmarking, Chatbot | Code Available |
| A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | Jan 2, 2024 | Financial Analysis, Hallucination | Code Available |
| Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Dec 1, 2024 | Action Detection, Activity Detection | Code Available |
| Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts | Sep 25, 2024 | Hallucination | Code Available |
| Confidence-aware Denoised Fine-tuning of Off-the-shelf Models for Certified Robustness | Nov 13, 2024 | Adversarial Robustness, Denoising | Code Available |
| How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities | Nov 15, 2023 | Ethics, Fairness | Code Available |
| Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training | May 13, 2025 | Hallucination, Large Language Model | Code Available |
| How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild | Feb 18, 2025 | Articles, Hallucination | Code Available |
| Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs | Jun 11, 2025 | Dependency Parsing, Hallucination | Code Available |
| How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation? | Aug 1, 2021 | Domain Adaptation, Hallucination | Code Available |