SOTAVerified

Medical Question Answering

Papers

Showing 1-50 of 139 papers

Title | Status | Hype
Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Code | 4
Benchmarking Retrieval-Augmented Generation for Medicine | Code | 4
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning | Code | 2
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Code | 2
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Code | 2
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains | Code | 2
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator | Code | 2
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models | Code | 2
Huatuo-26M, a Large-scale Chinese Medical QA Dataset | Code | 2
PMC-LLaMA: Towards Building Open-source Language Models for Medicine | Code | 2
GreaseLM: Graph REASoning Enhanced Language Models for Question Answering | Code | 2
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations | Code | 1
Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs | Code | 1
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering | Code | 1
Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources | Code | 1
DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models | Code | 1
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical Question-Answering | Code | 1
Large language model validity via enhanced conformal prediction methods | Code | 1
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models | Code | 1
MedLM: Exploring Language Models for Medical Question Answering Systems | Code | 1
JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning | Code | 1
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model | Code | 1
Integrating UMLS Knowledge into Large Language Models for Medical Question Answering | Code | 1
Towards Expert-Level Medical Question Answering with Large Language Models | Code | 1
Benchmarking large language models for biomedical natural language processing applications and recommendations | Code | 1
FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering | Code | 1
Relation-Aware Language-Graph Transformer for Question Answering | Code | 1
LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs | Code | 1
Kformer: Knowledge Injection in Transformer Feed-Forward Layers | Code | 1
A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding | Code | 1
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference | Code | 1
Question-Driven Summarization of Answers to Consumer Health Questions | Code | 1
From RAG to Agentic: Validating Islamic-Medicine Responses with LLM Agents | - | 0
Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs | - | 0
MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models | - | 0
Med-REFL: Medical Reasoning Enhancement via Self-Corrected Fine-grained Reflection | Code | 0
ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | - | 0
Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | - | 0
MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering | - | 0
ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room | - | 0
AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare | Code | 0
Task Specific Pruning with LLM-Sieve: How Many Parameters Does Your Task Really Need? | - | 0
Collaboration among Multiple Large Language Models for Medical Question Answering | - | 0
Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model | Code | 0
What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs | - | 0
Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI | - | 0
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding | - | 0
Talk Before You Retrieve: Agent-Led Discussions for Better RAG in Medical QA | Code | 0
Exploring the Role of Knowledge Graph-Based RAG in Japanese Medical Question Answering with Small-Scale LLMs | - | 0
PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization | - | 0

No leaderboard results yet.