| FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering | Feb 23, 2023 | Knowledge Graphs, Medical Question Answering | Code Available | 1 |
| Relation-Aware Language-Graph Transformer for Question Answering | Dec 2, 2022 | Medical Question Answering, MedQA | Code Available | 1 |
| LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs | Apr 20, 2022 | Conversational Question Answering, Dialogue Generation | Code Available | 1 |
| Kformer: Knowledge Injection in Transformer Feed-Forward Layers | Jan 15, 2022 | Language Modelling, Medical Question Answering | Code Available | 1 |
| A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding | Aug 1, 2021 | Data Augmentation, Decoder | Code Available | 1 |
| Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference | Dec 16, 2020 | Feature Engineering, Medical Question Answering | Code Available | 1 |
| Question-Driven Summarization of Answers to Consumer Health Questions | May 18, 2020 | Medical Question Answering, Question Answering | Code Available | 1 |
| From RAG to Agentic: Validating Islamic-Medicine Responses with LLM Agents | Jun 18, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs | Jun 13, 2025 | Medical Question Answering, MedQA | Unverified | 0 |
| MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models | Jun 12, 2025 | Image Segmentation, Medical Diagnosis | Unverified | 0 |
| Med-REFL: Medical Reasoning Enhancement via Self-Corrected Fine-grained Reflection | Jun 11, 2025 | Medical Question Answering, MedQA | Code Available | 0 |
| ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | May 30, 2025 | Medical Question Answering, Multiple-choice | Unverified | 0 |
| Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | May 30, 2025 | Fact Checking, Hallucination | Unverified | 0 |
| MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering | May 29, 2025 | Medical Question Answering, Question Answering | Unverified | 0 |
| ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room | May 28, 2025 | Medical Question Answering, Question Answering | Unverified | 0 |
| AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare | May 26, 2025 | Benchmarking, Medical Diagnosis | Code Available | 0 |
| Task Specific Pruning with LLM-Sieve: How Many Parameters Does Your Task Really Need? | May 23, 2025 | Medical Question Answering, Quantization | Unverified | 0 |
| Collaboration among Multiple Large Language Models for Medical Question Answering | May 22, 2025 | Medical Question Answering, Multiple-choice | Unverified | 0 |
| Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs | May 15, 2025 | All, Benchmarking | Unverified | 0 |
| Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI | May 11, 2025 | Medical Question Answering, Question Answering | Unverified | 0 |
| Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding | Apr 30, 2025 | Medical Question Answering, Question Answering | Unverified | 0 |
| Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA | Apr 30, 2025 | Information Retrieval, Medical Question Answering | Code Available | 0 |
| Exploring the Role of Knowledge Graph-Based RAG in Japanese Medical Question Answering with Small-Scale LLMs | Apr 15, 2025 | Medical Question Answering, Question Answering | Unverified | 0 |
| PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization | Apr 10, 2025 | Anomaly Detection, Bilevel Optimization | Unverified | 0 |