Multiple-choice

Papers

Showing 61–70 of 1107 papers

Title	Date	Tasks	Status	Hype
GPQA: A Graduate-Level Google-Proof Q&A Benchmark	Nov 20, 2023	Multiple-choice	Code Available	2
SafetyBench: Evaluating the Safety of Large Language Models	Sep 13, 2023	Multiple-choice	Code Available	2
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants	Aug 31, 2023	Belebele, Cross-Lingual Transfer	Code Available	2
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models	Aug 19, 2023	Multiple-choice	Code Available	2
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding	Jul 31, 2023	Multiple-choice, Question Answering	Code Available	2
SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension	Jul 30, 2023	Benchmarking, Multiple-choice	Code Available	2
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization	Jan 28, 2023	Hallucination, Multiple-choice	Code Available	2
Perception Test: A Diagnostic Benchmark for Multimodal Models	Oct 19, 2022	Diagnostic, Multiple-choice	Code Available	2
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering	Sep 20, 2022	Multimodal Deep Learning, Multimodal Reasoning	Code Available	2
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering	Mar 27, 2022	Diversity, Multiple-choice	Code Available	2

No leaderboard results yet.