SOTAVerified

Multiple-choice

Papers

Showing 671-680 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Automating question generation from educational text | | 0 |
| HANS, are you clever? Clever Hans Effect Analysis of Neural Systems | | 0 |
| Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models | Code | 0 |
| Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation | Code | 1 |
| Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change | | 0 |
| Language models are susceptible to incorrect patient self-diagnosis in medical applications | | 0 |
| Self-Assessment Tests are Unreliable Measures of LLM Personality | | 0 |
| SafetyBench: Evaluating the Safety of Large Language Models | Code | 2 |
| Performance of ChatGPT-3.5 and GPT-4 on the United States Medical Licensing Examination With and Without Distractions | | 0 |
| Use neural networks to recognize students' handwritten letters and incorrect symbols | | 0 |

No leaderboard results yet.