SOTAVerified

Multiple-choice

Papers

Showing 11–20 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench | Code | 4 |
| Video-LLaVA: Learning United Visual Representation by Alignment Before Projection | Code | 4 |
| VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation | Code | 4 |
| Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Code | 4 |
| SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Code | 3 |
| PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models | Code | 3 |
| SEED-Bench: Benchmarking Multimodal Large Language Models | Code | 3 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Code | 3 |
| Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | Code | 3 |
| C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models | Code | 3 |

No leaderboard results yet.