SOTAVerified

Multiple-choice

Papers

Showing 241-250 of 1107 papers

Title | Status | Hype
Language Model Uncertainty Quantification with Attention Chain | Code | 1
Large Language Models Encode Clinical Knowledge | Code | 1
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Code | 1
LibriSQA: A Novel Dataset and Framework for Spoken Question Answering with Large Language Models | Code | 1
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs | Code | 1
Assessing the Chemical Intelligence of Large Language Models | Code | 1
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? | Code | 1
MedQA-CS: Benchmarking Large Language Models Clinical Skills Using an AI-SCE Framework | Code | 1
Constructing Narrative Event Evolutionary Graph for Script Event Prediction | Code | 1
Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark | Code | 1

No leaderboard results yet.