SOTAVerified

Multiple-choice

Papers

Showing 821–830 of 1107 papers

Title | Status | Hype
The Impact of Item-Writing Flaws on Difficulty and Discrimination in Item Response Theory | | 0
A Novel Approach for Constrained Optimization in Graphical Models | | 0
BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles | | 0
The Lazy Student's Dream: ChatGPT Passing an Engineering Course on Its Own | | 0
BLINK: Multimodal Large Language Models Can See but Not Perceive | | 0
An MRC Framework for Semantic Role Labeling | | 0
BloomVQA: Assessing Hierarchical Multi-modal Comprehension | | 0
The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs | | 0
The Role of Large Language Models in Musicology: Are We Ready to Trust the Machines? | | 0
Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs | | 0

No leaderboard results yet.