Multiple-choice

Papers

Showing 421–430 of 1107 papers

Title	Date	Tasks	Status	Hype
Changing Answer Order Can Decrease MMLU Accuracy	Jun 27, 2024	MMLU, Multiple-choice	Unverified	0
Length Optimization in Conformal Prediction	Jun 27, 2024	Conformal Prediction, Language Modeling	Code Available	0
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor Generation, Math	Code Available	0
VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation	Jun 25, 2024	ARC, Benchmarking	Code Available	0
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration	Jun 24, 2024	Diversity, Multiple-choice	Unverified	0
HCQA @ Ego4D EgoSchema Challenge 2024	Jun 22, 2024	Caption Generation	Code Available	1
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	Benchmarking, Classification	Code Available	1
SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages	Jun 20, 2024	Language Modelling, Large Language Model	Unverified	0
QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism	Jun 19, 2024	Multiple-choice, Question Answering	Unverified	0
ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World	Jun 19, 2024	Diagnostic, Multiple-choice	Code Available	2
