| Disentangling Reasoning and Knowledge in Medical Large Language Models | May 16, 2025 | Diagnostic, MedQA | —Unverified | 0 | 0 |
| Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge | Feb 18, 2025 | Graph Generation, Knowledge Graphs | —Unverified | 0 | 0 |
| A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage | Apr 28, 2025 | MedQA | —Unverified | 0 | 0 |
| AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset | Nov 23, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | May 13, 2024 | Decision Making, Diagnostic | —Unverified | 0 | 0 |
| Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents | May 5, 2024 | MedQA, Question Answering | —Unverified | 0 | 0 |
| A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? | Sep 23, 2024 | Hallucination, MedQA | —Unverified | 0 | 0 |
| Assessing The Potential Of Mid-Sized Language Models For Clinical QA | Apr 24, 2024 | MedQA, Question Answering | —Unverified | 0 | 0 |
| AutoMedPrompt: A New Framework for Optimizing LLM Medical Prompts Using Textual Gradients | Feb 21, 2025 | MedQA, Prompt Engineering | —Unverified | 0 | 0 |
| Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | Jun 25, 2025 | Articles, Continual Pretraining | —Unverified | 0 | 0 |
| CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering | Jan 30, 2025 | General Knowledge, Language Modeling | —Unverified | 0 | 0 |
| Capabilities of Gemini Models in Medicine | Apr 29, 2024 | In-Context Learning, MedQA | —Unverified | 0 | 0 |
| Challenges of GPT-3-based Conversational Agents for Healthcare | Aug 28, 2023 | Medical Question Answering, MedQA | —Unverified | 0 | 0 |
| CliniChat: A Multi-Source Knowledge-Driven Framework for Clinical Interview Dialogue Reconstruction and Evaluation | Apr 14, 2025 | MedQA | —Unverified | 0 | 0 |
| DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents | Mar 30, 2023 | Conversation Summarization, Language Modeling | —Unverified | 0 | 0 |
| DiversityMedQA: Assessing Demographic Biases in Medical Diagnosis using Large Language Models | Sep 2, 2024 | Medical Diagnosis, MedQA | —Unverified | 0 | 0 |
| DoPAMine: Domain-specific Pre-training Adaptation from seed-guided data Mining | Sep 30, 2024 | Continual Pretraining, Domain Adaptation | —Unverified | 0 | 0 |
| Eir: Thai Medical Large Language Models | Sep 13, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation | Jun 7, 2025 | MedQA, Quantization | —Unverified | 0 | 0 |
| Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems | Mar 19, 2025 | counterfactual, Decision Making | —Unverified | 0 | 0 |
| Evaluation of the phi-3-mini SLM for identification of texts related to medicine, health, and sports injuries | Mar 31, 2025 | 4k, MedQA | —Unverified | 0 | 0 |
| Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training | Jun 18, 2025 | MedQA, MMLU | —Unverified | 0 | 0 |
| Generating multiple-choice questions for medical question answering with distractors and cue-masking | Mar 13, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering | Mar 22, 2023 | Common Sense Reasoning, Knowledge Graphs | —Unverified | 0 | 0 |
| GreaseLM: Graph REASoning Enhanced Language Models | Sep 29, 2021 | Knowledge Graphs, Medical Question Answering | —Unverified | 0 | 0 |