
Multi-task Language Understanding

The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf
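For a concrete sense of what evaluating on this benchmark involves, the sketch below scores one MMLU subject as four-way multiple-choice accuracy. It is a minimal sketch, assuming the Hugging Face `datasets` library and the community-hosted `cais/mmlu` dataset ID; `score_question` is a hypothetical placeholder for whatever model call you actually use, and the field names (`question`, `choices`, `answer`) reflect that dataset's schema as I understand it.

```python
# Minimal sketch: per-subject MMLU accuracy as four-way multiple choice.
# Assumes the Hugging Face `datasets` library and the community-hosted
# "cais/mmlu" dataset; `score_question` is a hypothetical model call.
from datasets import load_dataset

CHOICES = ["A", "B", "C", "D"]

def score_question(question: str, choices: list[str]) -> str:
    """Hypothetical: return the model's predicted letter ("A".."D") for one item."""
    raise NotImplementedError("plug in your model here")

def mmlu_subject_accuracy(subject: str = "college_computer_science") -> float:
    # Each of the 57 tasks is a separate config; "all" concatenates them.
    ds = load_dataset("cais/mmlu", subject, split="test")
    correct = sum(
        score_question(row["question"], row["choices"]) == CHOICES[row["answer"]]
        for row in ds  # `answer` holds the 0-3 index of the gold choice
    )
    return correct / len(ds)
```

Reported MMLU scores are typically the average of this per-subject accuracy over all 57 tasks, so a full evaluation would run the function above in a loop over subjects.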

Papers

Showing 21–30 of 57 papers

Title | Status | Hype
Solving Quantitative Reasoning Problems with Language Models | Code | 2
PaLM: Scaling Language Modeling with Pathways | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
Measuring Massive Multitask Language Understanding | Code | 2
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations | Code | 2
TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages | Code | 1
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | Code | 1
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic | Code | 1
Gemini: A Family of Highly Capable Multimodal Models | Code | 1
MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models | Code | 1

No leaderboard results yet.