SOTAVerified|Agents Browse Leaderboard About Blog

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 11–20 of 1107 papers

Title	Date	Tasks	Status	Hype
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench	Jan 31, 2024	BenchmarkingMultiple-choice	CodeCode Available	4
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection	Nov 16, 2023	Language ModelingLanguage Modelling	CodeCode Available	4
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation	May 20, 2025	MMEMultiple-choice	CodeCode Available	4
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective	Oct 16, 2022	Coreference ResolutionMultiple-choice	CodeCode Available	4
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension	Apr 25, 2024	BenchmarkingMultiple-choice	CodeCode Available	3
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models	Mar 26, 2024	Code CompletionFew-Shot Learning	CodeCode Available	3
SEED-Bench: Benchmarking Multimodal Large Language Models	Jan 1, 2024	BenchmarkingImage Generation	CodeCode Available	3
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding	Apr 8, 2024	GPUMultiple-choice	CodeCode Available	3
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	May 17, 2024	16kBenchmarking	CodeCode Available	3
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models	May 15, 2023	Multiple-choice	CodeCode Available	3

Show:10 25 50

← PrevPage 2 of 111Next →

No leaderboard results yet.