| Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework | Mar 7, 2025 | Conformal PredictionMedical Question Answering | —Unverified | 0 |
| Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support | Feb 25, 2025 | Decision MakingDiagnostic | CodeCode Available | 2 |
| AutoMedPrompt: A New Framework for Optimizing LLM Medical Prompts Using Textual Gradients | Feb 21, 2025 | MedQAPrompt Engineering | —Unverified | 0 |
| Agentic Medical Knowledge Graphs Enhance Medical Question Answering: Bridging the Gap Between LLMs and Evolving Medical Knowledge | Feb 18, 2025 | Graph GenerationKnowledge Graphs | —Unverified | 0 |
| OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning | Feb 16, 2025 | MedQAMMLU | —Unverified | 0 |
| CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering | Jan 30, 2025 | General KnowledgeLanguage Modeling | —Unverified | 0 |
| MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding | Jan 30, 2025 | BenchmarkingDecision Making | —Unverified | 0 |
| O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning | Jan 11, 2025 | Decision MakingDiagnostic | CodeCode Available | 1 |
| LLM-MedQA: Enhancing Medical Question Answering through Case Studies in Large Language Models | Dec 31, 2024 | Medical Question AnsweringMedQA | —Unverified | 0 |
| AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset | Nov 23, 2024 | Language ModelingLanguage Modelling | —Unverified | 0 |