SOTAVerified

Multiple-choice

Papers

Showing 421-430 of 1107 papers

Title | Status | Hype
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning | Code | 0
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales | Code | 0
Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models | Code | 0
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Code | 0
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? | Code | 0
Introducing a framework to assess newly created questions with Natural Language Processing | Code | 0
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models | Code | 0
Self-Recognition in Language Models | Code | 0
LLaVA-OneVision: Easy Visual Task Transfer | Code | 0
Improving Question Answering with External Knowledge | Code | 0

No leaderboard results yet.