SOTAVerified

Multiple-choice

Papers

Showing 881-890 of 1107 papers

Title | Status | Hype
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback | Code | 0
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models | Code | 0
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning | Code | 0
Automating Turkish Educational Quiz Generation Using Large Language Models | Code | 0
How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making? | Code | 0
Measuring Agreeableness Bias in Multimodal Models | Code | 0
CSEPrompts: A Benchmark of Introductory Computer Science Prompts | Code | 0
MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks | Code | 0
MedG-KRP: Medical Graph Knowledge Representation Probing | Code | 0
How much do LLMs learn from negative examples? | Code | 0

No leaderboard results yet.