| TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models | Jun 7, 2024 | Medical Question Answering, Question Answering | —Unverified | 0 |
| Towards Generalist Biomedical AI | Jul 26, 2023 | Medical Question Answering, Question Answering | —Unverified | 0 |
| Towards Reliable Medical Question Answering: Techniques and Challenges in Mitigating Hallucinations in Language Models | Aug 25, 2024 | Decision Making, Hallucination | —Unverified | 0 |
| Two-Layer Retrieval-Augmented Generation Framework for Low-Resource Medical Question Answering Using Reddit Data: Proof-of-Concept Study | May 29, 2024 | Answer Generation, Hallucination | —Unverified | 0 |
| Uncertainty Estimation of Large Language Models in Medical Question Answering | Jul 11, 2024 | Medical Question Answering, Question Answering | —Unverified | 0 |
| Unifying Corroborative and Contributive Attributions in Large Language Models | Nov 20, 2023 | Language Modeling, Language Modelling | —Unverified | 0 |
| WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction | Apr 22, 2024 | Diversity, Language Modeling | —Unverified | 0 |
| What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs | May 15, 2025 | All, Benchmarking | —Unverified | 0 |
| What Would it Take to get Biomedical QA Systems into Practice? | Sep 21, 2021 | Medical Question Answering, Question Answering | —Unverified | 0 |
| LLM-MedQA: Enhancing Medical Question Answering through Case Studies in Large Language Models | Dec 31, 2024 | Medical Question Answering, MedQA | —Unverified | 0 |