SOTAVerified

Multiple-choice

Papers

Showing 191–200 of 1107 papers

Title | Status | Hype
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward | Code | 1
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning | Code | 1
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce | Code | 1
Multiple-Choice Questions are Efficient and Robust LLM Evaluators | Code | 1
Explaining NLP Models via Minimal Contrastive Editing (MiCE) | Code | 1
A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies. | Code | 1
NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism | Code | 1
Explicit Planning Helps Language Models in Logical Reasoning | Code | 1
AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models | Code | 1
IRLBench: A Multi-modal, Culturally Grounded, Parallel Irish-English Benchmark for Open-Ended LLM Reasoning Evaluation | Code | 1

No leaderboard results yet.