SOTAVerified

Multiple-choice

Papers

Showing 641–650 of 1107 papers

| Title | Status | Hype |
|---|---|---|
| Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models | Code | 1 |
| Fake Alignment: Are LLMs Really Aligned Well? | Code | 1 |
| Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks | | 0 |
| Assessing Distractors in Multiple-Choice Tests | | 0 |
| Evaluating multiple large language models in pediatric ophthalmology | | 0 |
| Evaluating the Potential of Leading Large Language Models in Reasoning Biology Questions | | 0 |
| More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems | | 0 |
| CASE: Commonsense-Augmented Score with an Expanded Answer Space | Code | 0 |
| Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis | Code | 1 |
| An Open Source Data Contamination Report for Large Language Models | Code | 1 |

No leaderboard results yet.