| An Empirical Evaluation of Large Language Models on Consumer Health Questions | Dec 31, 2024 | Medical Question Answering, Question Answering | Unverified | 0 | 0 |
| ANU-CSIRO at MEDIQA 2019: Question Answering Using Deep Contextual Knowledge | Aug 1, 2019 | Medical Question Answering, Natural Language Inference | Unverified | 0 | 0 |
| ARS\_NITK at MEDIQA 2019: Analysing Various Methods for Natural Language Inference, Recognising Question Entailment and Medical Question Answering System | Aug 1, 2019 | Information Retrieval, Medical Question Answering | Unverified | 0 | 0 |
| A Survey for Large Language Models in Biomedicine | Aug 29, 2024 | Diagnostic, Drug Discovery | Unverified | 0 | 0 |
| Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI | May 11, 2025 | Medical Question Answering, Question Answering | Unverified | 0 | 0 |
| Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding | Apr 30, 2025 | Medical Question Answering, Question Answering | Unverified | 0 | 0 |
| Challenges of GPT-3-based Conversational Agents for Healthcare | Aug 28, 2023 | Medical Question Answering, MedQA | Unverified | 0 | 0 |
| ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | May 30, 2025 | Medical Question Answering, Multiple-choice | Unverified | 0 | 0 |
| Collaboration among Multiple Large Language Models for Medical Question Answering | May 22, 2025 | Medical Question Answering, Multiple-choice | Unverified | 0 | 0 |
| Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering | Nov 14, 2024 | Medical Question Answering, Misinformation | Unverified | 0 | 0 |