SOTAVerified

Medical Visual Question Answering

Papers

Showing 1-50 of 97 papers

Title | Status | Hype
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Code | 4
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Code | 4
Flamingo: a Visual Language Model for Few-Shot Learning | Code | 4
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine | Code | 3
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities | Code | 2
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis | Code | 2
PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging | Code | 2
Med-Flamingo: a Multimodal Medical Few-shot Learner | Code | 2
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks | Code | 2
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Code | 2
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks | Code | 1
MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1
A Survey of Medical Vision-and-Language Applications and Their Techniques | Code | 1
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration | Code | 1
MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models | Code | 1
Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery | Code | 1
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering | Code | 1
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA | Code | 1
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering | Code | 1
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Code | 1
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA | Code | 1
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images | Code | 1
Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering | Code | 1
Rad-ReStruct: A Novel VQA Benchmark and Method for Structured Radiology Reporting | Code | 1
Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering | Code | 1
Localized Questions in Medical Visual Question Answering | Code | 1
MedBLIP: Bootstrapping Language-Image Pre-training from 3D Medical Images and Texts | Code | 1
PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering | Code | 1
Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining | Code | 1
Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models | Code | 1
BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs | Code | 1
Self-supervised vision-language pretraining for Medical visual question answering | Code | 1
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training | Code | 1
Multiple Meta-model Quantifying for Medical Visual Question Answering | Code | 1
SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering | Code | 1
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports | Code | 1
PathVQA: 30000+ Questions for Medical Visual Question Answering | Code | 1
Overcoming Data Limitation in Medical Visual Question Answering | Code | 1
Barriers in Integrating Medical Visual Question Answering into Radiology Workflows: A Scoping Review and Clinicians' Insights | - | 0
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning | - | 0
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning | - | 0
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | - | 0
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy | Code | 0
MedOrch: Medical Diagnosis with Tool-Augmented Reasoning Agents for Flexible Extensibility | - | 0
A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering | - | 0
Toward Effective Reinforcement Learning Fine-Tuning for Medical VQA in Vision-Language Models | - | 0
Structure Causal Models and LLMs Integration in Medical Visual Question Answering | - | 0
Bridging the Semantic Gaps: Improving Medical VQA Consistency with LLM-Augmented Question Sets | - | 0
Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion | - | 0
Vision-Amplified Semantic Entropy for Hallucination Detection in Medical Visual Question Answering | - | 0

No leaderboard results yet.