| DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis? | May 30, 2025 | Diagnostic, Medical Image Analysis | Code Available | 1 |
| CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs | May 30, 2025 | Diagnostic, Image Comprehension | Unverified | 0 |
| Searching Clinical Data Using Generative AI | May 30, 2025 | Diagnostic | Unverified | 0 |
| Reasoning Can Hurt the Inductive Abilities of Large Language Models | May 30, 2025 | Diagnostic | Unverified | 0 |
| FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation | May 30, 2025 | Diagnostic, Language Model Evaluation | Code Available | 0 |
| LLaMA-XR: A Novel Framework for Radiology Report Generation using LLaMA and QLoRA Fine Tuning | May 29, 2025 | Computational Efficiency, Diagnostic | Unverified | 0 |
| Comparative analysis of privacy-preserving open-source LLMs regarding extraction of diagnostic information from clinical CMR imaging reports | May 29, 2025 | Descriptive, Diagnostic | Unverified | 0 |
| SafeCOMM: What about Safety Alignment in Fine-Tuned Telecom Large Language Models? | May 29, 2025 | Diagnostic, Red Teaming | Unverified | 0 |
| Multi-output Classification using a Cross-talk Architecture for Compound Fault Diagnosis of Motors in Partially Labeled Condition | May 29, 2025 | Diagnostic, Domain Adaptation | Unverified | 0 |
| Infi-Med: Low-Resource Medical MLLMs with Robust Reasoning Evaluation | May 29, 2025 | Diagnostic, Multimodal Reasoning | Unverified | 0 |