
Multi-task Language Understanding

The test covers 57 tasks, including elementary mathematics, US history, computer science, law, and more (https://arxiv.org/pdf/2009.03300.pdf).
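The benchmark is scored as plain accuracy over multiple-choice questions, reported per task and macro-averaged across the 57 tasks. A minimal scoring sketch, assuming hypothetical task names and illustrative (not real) question data:

```python
# Minimal sketch of MMLU-style scoring: each task is a set of
# four-choice questions; the benchmark reports per-task accuracy
# and a macro-average over tasks. All data below is illustrative.

def score(predictions, answers):
    """Fraction of questions where the predicted choice matches the key."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# Hypothetical (prediction, answer-key) pairs for two of the 57 tasks.
tasks = {
    "elementary_mathematics": (["A", "C", "B"], ["A", "C", "D"]),
    "us_history": (["B", "B"], ["B", "A"]),
}

per_task = {name: score(pred, gold) for name, (pred, gold) in tasks.items()}
overall = sum(per_task.values()) / len(per_task)  # macro-average over tasks
```

Note that the macro-average weights every task equally regardless of how many questions it contains, which is why per-task accuracy is computed first.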

Papers

Showing 31–40 of 57 papers

Title | Status | Hype
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU | Code | 1
Are Human-generated Demonstrations Necessary for In-context Learning? | Code | 1
UL2: Unifying Language Learning Paradigms | Code | 1
GPT-NeoX-20B: An Open-Source Autoregressive Language Model | Code | 1
Merging Models with Fisher-Weighted Averaging | Code | 1
UnifiedQA: Crossing Format Boundaries With a Single QA System | Code | 1
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Code | 1
Language Models are Unsupervised Multitask Learners | Code | 1
Measuring Hong Kong Massive Multi-Task Language Understanding | — | 0
Effectiveness of Zero-shot-CoT in Japanese Prompts | — | 0
Page 4 of 6

No leaderboard results yet.