
Multiple-choice

Papers


Showing 981–990 of 1107 papers

Title	Date	Tasks	Status	Hype
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available	0
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval	Oct 3, 2023	Articles, Decision Making	Code Available	0
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare	Feb 22, 2025	Decision Making, Multiple-choice	Code Available	0
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles	Feb 19, 2024	Multiple-choice, Uncertainty Quantification	Code Available	0
Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings	Dec 9, 2024	Multiple-choice	Code Available	0
Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam	Jun 14, 2024	Fairness, Logical Reasoning	Code Available	0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models	May 13, 2025	Form, Multiple-choice	Code Available	0
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach	Sep 9, 2024	Computational Efficiency, Continual Pretraining	Code Available	0
Evaluating Large Language Model Biases in Persona-Steered Generation	May 30, 2024	Language Modeling, Language Modelling	Code Available	0
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM	Mar 12, 2025	Image Segmentation, Medical Image Segmentation	Code Available	0



No leaderboard results yet.