
Multi-task Language Understanding

The MMLU benchmark covers 57 tasks, including elementary mathematics, US history, computer science, law, and more. https://arxiv.org/pdf/2009.03300.pdf

Papers

Showing 11–20 of 57 papers

| Title | Status | Hype |
| --- | --- | --- |
| MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark | Code | 3 |
| REPLUG: Retrieval-Augmented Black-Box Language Models | Code | 3 |
| Scaling Instruction-Finetuned Language Models | Code | 3 |
| Evaluating Large Language Models Trained on Code | Code | 3 |
| Language Models are Few-Shot Learners | Code | 3 |
| MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Code | 2 |
| Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling | Code | 2 |
| Routoo: Learning to Route to Large Language Models Effectively | Code | 2 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2 |
| Atlas: Few-shot Learning with Retrieval Augmented Language Models | Code | 2 |
Page 2 of 6

No leaderboard results yet.