SOTAVerified

Multiple-choice

Papers

Showing 651–660 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
Predicting the Difficulty of Multiple Choice Questions in a High-stakes Medical Exam	Aug 1, 2019	Multiple-choice, Question Answering	Unverified	0	0
Predictions from language models for multiple-choice tasks are not robust under variation of scoring methods	Mar 1, 2024	Multiple-choice	Unverified	0	0
Probabilistic Consensus through Ensemble Validation: A Framework for LLM Reliability	Nov 10, 2024	Multiple-choice, Text Generation	Unverified	0	0
Prompt Engineering and Calibration for Zero-Shot Commonsense Reasoning	Apr 14, 2023	Multiple-choice, Prompt Engineering	Unverified	0	0
Prompting Implicit Discourse Relation Annotation	Feb 7, 2024	Classification, Implicit Discourse Relation Classification	Unverified	0	0
Instruction Fine-Tuning: Does Prompt Loss Matter?	Jan 24, 2024	Multiple-choice, token-classification	Unverified	0	0
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding	Nov 7, 2024	Benchmarking, Multiple-choice	Unverified	0	0
ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology	Nov 16, 2023	MMLU, Multiple-choice	Unverified	0	0
PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities	Jan 13, 2024	Instruction Following, Multiple-choice	Unverified	0	0
Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs	Sep 30, 2024	Benchmarking, Multiple-choice	Unverified	0	0

No leaderboard results yet.