SOTAVerified

Multiple-choice

Papers

Showing 991–1000 of 1107 papers

Title | Status | Hype
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios | Code | 0
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering | Code | 0
Order-Independence Without Fine Tuning | Code | 0
Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings | Code | 0
PROST: Physical Reasoning of Objects through Space and Time | Code | 0
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence | Code | 0
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting | Code | 0
This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models | Code | 0
Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks | Code | 0
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams | Code | 0
Page 100 of 111

No leaderboard results yet.