| Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Aug 1, 2024 | Medical Question Answering, MedQA | Code Available | 4 | 5 |
| Benchmarking Retrieval-Augmented Generation for Medicine | Feb 20, 2024 | Benchmarking, Information Retrieval | Code Available | 4 | 5 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Apr 16, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains | Feb 15, 2024 | Few-Shot Learning, Medical Question Answering | Code Available | 2 | 5 |
| Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models | Jan 27, 2024 | Medical Question Answering, Multiple-choice | Code Available | 2 | 5 |
| MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Mar 10, 2025 | Benchmarking, Medical Question Answering | Code Available | 2 | 5 |
| AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator | Feb 15, 2024 | Benchmarking, Diagnostic | Code Available | 2 | 5 |
| GreaseLM: Graph REASoning Enhanced Language Models for Question Answering | Jan 21, 2022 | Knowledge Graphs, Medical Question Answering | Code Available | 2 | 5 |
| Huatuo-26M, a Large-scale Chinese Medical QA Dataset | May 2, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| PMC-LLaMA: Towards Building Open-source Language Models for Medicine | Apr 27, 2023 | Language Modeling, Language Modelling | Code Available | 2 | 5 |