Multiple-choice

Papers

Showing 81–90 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
Explaining NLP Models via Minimal Contrastive Editing (MiCE)	Dec 27, 2020	counterfactual, Multiple-choice	Code Available	1	5
FaceXBench: Evaluating Multimodal LLMs on Face Understanding	Jan 17, 2025	Fairness, Multiple-choice	Code Available	1	5
Boosting Healthcare LLMs Through Retrieved Context	Sep 23, 2024	Benchmarking, Multiple-choice	Code Available	1	5
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages	Nov 25, 2024	All, Long Question Answer	Code Available	1	5
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models	Oct 8, 2023	Distractor Generation, Language Modelling	Code Available	1	5
Bridging Video-text Retrieval with Multiple Choice Questions	Jan 13, 2022	Action Recognition, Linear evaluation	Code Available	1	5
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models	Oct 5, 2023	Common Sense Reasoning, Multiple-choice	Code Available	1	5
Evaluating the Knowledge Dependency of Questions	Nov 21, 2022	Multiple-choice	Code Available	1	5
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation	Sep 19, 2023	Language Model Evaluation, Language Modeling	Code Available	1	5
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing	May 16, 2025	Instruction Following, Multiple-choice	Code Available	1	5

No leaderboard results yet.