SOTAVerified

Multiple-choice

Papers

Showing 101-110 of 1107 papers

Title | Status | Hype
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic | Code | 1
General-Purpose Question-Answering with Macaw | Code | 1
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering | Code | 1
Bridging Video-text Retrieval with Multiple Choice Questions | Code | 1
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models | Code | 1
GPT Takes the Bar Exam | Code | 1
FaceXBench: Evaluating Multimodal LLMs on Face Understanding | Code | 1
Boosting Healthcare LLMs Through Retrieved Context | Code | 1
Fake Alignment: Are LLMs Really Aligned Well? | Code | 1
A Hitchhiker's Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning | Code | 1

No leaderboard results yet.