SOTAVerified

Multiple-choice

Papers

Showing 31-40 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking | Code | 2 |
| MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Code | 2 |
| Neptune: The Long Orbit to Benchmarking Long Video Understanding | Code | 2 |
| StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification | Code | 2 |
| HourVideo: 1-Hour Video-Language Understanding | Code | 2 |
| PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Code | 2 |
| Towards Evaluating and Building Versatile Large Language Models for Medicine | Code | 2 |
| MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models | Code | 2 |
| XMainframe: A Large Language Model for Mainframe Modernization | Code | 2 |

No leaderboard results yet.