SOTAVerified

Multiple-choice

Papers

Showing 861–870 of 1107 papers

Title | Status | Hype
CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models | | 0
COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain | | 0
Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments | | 0
Collaboration among Multiple Large Language Models for Medical Question Answering | | 0
Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle to Pass Assessments in Higher Education Programming Courses | | 0
Combinatorial framework for planning in geological exploration | | 0
Combining Multiple Cues for Visual Madlibs Question Answering | | 0
Comparative Study of Learning Outcomes for Online Learning Platforms | | 0
Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding | | 0
Confidence-Aware Learning Assistant | | 0

No leaderboard results yet.