SOTAVerified

Medical Visual Question Answering

Papers

Showing 1–50 of 97 papers

Title | Status | Hype
Barriers in Integrating Medical Visual Question Answering into Radiology Workflows: A Scoping Review and Clinicians' Insights | - | 0
SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning | - | 0
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning | - | 0
CAPO: Reinforcing Consistent Reasoning in Medical Decision-Making | - | 0
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy | Code | 0
MedOrch: Medical Diagnosis with Tool-Augmented Reasoning Agents for Flexible Extensibility | - | 0
A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering | - | 0
Toward Effective Reinforcement Learning Fine-Tuning for Medical VQA in Vision-Language Models | - | 0
MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks | Code | 1
Structure Causal Models and LLMs Integration in Medical Visual Question Answering | - | 0
Bridging the Semantic Gaps: Improving Medical VQA Consistency with LLM-Augmented Question Sets | - | 0
Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion | - | 0
Vision-Amplified Semantic Entropy for Hallucination Detection in Medical Visual Question Answering | - | 0
DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels | - | 0
ClinKD: Cross-Modal Clinical Knowledge Distiller For Multi-Task Medical Images | Code | 0
Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering | - | 0
MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities | Code | 2
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | - | 0
Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model | - | 0
A Survey of Medical Vision-and-Language Applications and Their Techniques | Code | 1
Efficient Bilinear Attention-based Fusion for Medical Visual Question Answering | - | 0
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest | - | 0
Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering | - | 0
LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound | - | 0
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration | Code | 1
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue | - | 0
MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models | Code | 1
Kvasir-VQA: A Text-Image Pair GI Tract Dataset | Code | 0
FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection | Code | 0
Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm | Code | 0
Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery | Code | 1
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine | Code | 3
Targeted Visual Prompting for Medical Visual Question Answering | Code | 0
Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering | - | 0
TM-PATHVQA: 90000+ Textless Multilingual Questions for Medical Visual Question Answering | - | 0
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering | Code | 1
Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis | - | 0
Detecting and Evaluating Medical Hallucinations in Large Vision Language Models | - | 0
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA | Code | 1
Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models | - | 0
Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering | - | 0
Grounded Knowledge-Enhanced Medical VLP for Chest X-Ray | - | 0
WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models | - | 0
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering | Code | 1
MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale | - | 0
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis | Code | 2
Enhancing Generalization in Medical Visual Question Answering Tasks via Gradient-Guided Model Perturbation | - | 0
Prompt-based Personalized Federated Learning for Medical Visual Question Answering | - | 0
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Code | 4
Page 1 of 2

Leaderboard

No leaderboard results yet.