SOTAVerified

Multiple-choice

Papers

Showing 41-50 of 1107 papers

Title | Status | Hype
HourVideo: 1-Hour Video-Language Understanding | Code | 2
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models | Code | 2
Patho-R1: A Multimodal Reinforcement Learning-Based Pathology Expert Reasoner | Code | 2
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models | Code | 2
Biomedical knowledge graph-optimized prompt generation for large language models | Code | 2
MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking | Code | 2
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models | Code | 2
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | Code | 2
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology | Code | 2
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Code | 2

No leaderboard results yet.