| A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports | Sep 3, 2020 | Image-text Retrieval, Medical Visual Question Answering | Code Available | 1 |
| MedBLIP: Bootstrapping Language-Image Pre-training from 3D Medical Images and Texts | May 18, 2023 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| MedCoT: Medical Chain of Thought via Hierarchical Expert | Dec 18, 2024 | Diagnostic, Medical Visual Question Answering | Code Available | 1 |
| A Survey of Medical Vision-and-Language Applications and Their Techniques | Nov 19, 2024 | Decision Making, Diagnostic | Code Available | 1 |
| Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering | Jul 11, 2023 | Language Modeling, Medical Visual Question Answering | Code Available | 1 |
| MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration | Oct 6, 2024 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Feb 10, 2024 | Diagnostic, Hallucination | Code Available | 1 |
| LaPA: Latent Prompt Assist Model For Medical Visual Question Answering | Apr 19, 2024 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| Localized Questions in Medical Visual Question Answering | Jul 3, 2023 | Medical Visual Question Answering, Question Answering | Code Available | 1 |
| EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images | Oct 28, 2023 | Decision Making, Medical Visual Question Answering | Code Available | 1 |