SOTAVerified

Multiple-choice

Papers

Showing 61–70 of 1107 papers

Title | Status | Hype
Towards Evaluating and Building Versatile Large Language Models for Medicine | Code | 2
Understanding Long Videos with Multimodal Language Models | Code | 2
HourVideo: 1-Hour Video-Language Understanding | Code | 2
Biomedical knowledge graph-optimized prompt generation for large language models | Code | 2
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Code | 2
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Code | 2
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models | Code | 2
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models | Code | 2
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Code | 2
Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2

No leaderboard results yet.