SOTAVerified

Multiple-choice

Papers

Showing 761–770 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
The Achievement of Higher Flexibility in Multiple Choice-based Tests Using Image Classification Techniques	Nov 2, 2017	BIG-bench Machine Learning, General Classification	Unverified	0	0
AraSTEM: A Native Arabic Multiple Choice Question Benchmark for Evaluating LLMs Knowledge In STEM Subjects	Dec 31, 2024	Benchmarking, Multiple-choice	Unverified	0	0
AraTrust: An Evaluation of Trustworthiness for LLMs in Arabic	Mar 14, 2024	Ethics, Multiple-choice	Unverified	0	0
A recent evaluation on the performance of LLMs on radiation oncology physics using questions of randomly shuffled options	Dec 14, 2024	Multiple-choice	Unverified	0	0
Are LLM-generated plain language summaries truly understandable? A large-scale crowdsourced evaluation	May 15, 2025	Informativeness, Multiple-choice	Unverified	0	0
A review of faithfulness metrics for hallucination assessment in Large Language Models	Dec 31, 2024	Benchmarking, Hallucination	Unverified	0	0
Are You Doubtful? Oh, It Might Be Difficult Then! Exploring the Use of Model Uncertainty for Question Difficulty Estimation	Dec 16, 2024	Multiple-choice	Unverified	0	0
ARGUS: Hallucination and Omission Evaluation in Video-LLMs	Jun 9, 2025	Descriptive, Form	Unverified	0	0
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition	Oct 8, 2024	Action Recognition, Multiple-choice	Unverified	0	0
Aryl: An Elastic Cluster Scheduler for Deep Learning	Feb 16, 2022	Deep Learning, GPU	Unverified	0	0

No leaderboard results yet.