SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 981–990 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
Exploring the Comprehension of ChatGPT in Traditional Chinese Medicine Knowledge	Mar 14, 2024	Multiple-choice	—Unverified	0	0
How Additional Knowledge can Improve Natural Language Commonsense Question Answering?	Sep 19, 2019	ArticlesLanguage Modeling	—Unverified	0	0
Exposing the Limits of Video-Text Models through Contrast Sets	Jan 16, 2022	Language ModelingLanguage Modelling	—Unverified	0	0
Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History	Jan 15, 2025	Multiple-choiceQuestion Answering	—Unverified	0	0
FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees	Nov 4, 2024	Multiple-choiceQuestion Answering	—Unverified	0	0
Towards Multistage Design of Modular Systems	Jun 19, 2013	Multiple-choice	—Unverified	0	0
FAMULUS: Interactive Annotation and Feedback Generation for Teaching Diagnostic Reasoning	Aug 29, 2019	DiagnosticMultiple-choice	—Unverified	0	0
FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models	Apr 20, 2025	DescriptiveEthics	—Unverified	0	0
Town Hall Debate Prompting: Enhancing Logical Reasoning in LLMs through Multi-Persona Interaction	Jan 28, 2025	Logical ReasoningMultiple-choice	—Unverified	0	0
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding	Mar 19, 2025	BenchmarkingMultiple-choice	—Unverified	0	0

Show:10 25 50

← PrevPage 99 of 111Next →

No leaderboard results yet.