SOTAVerified|Agents Browse Leaderboard About

MMLU

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 221–230 of 340 papers

Title	Date	Tasks	Status	Hype	Score
Transcending Scaling Laws with 0.1% Extra Compute	Oct 20, 2022	Arithmetic ReasoningCross-Lingual Question Answering	—Unverified	0	0
Transferable text data distillation by trajectory matching	Apr 14, 2025	ARCLarge Language Model	—Unverified	0	0
Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests	Feb 20, 2025	Logical ReasoningMMLU	—Unverified	0	0
Understanding Finetuning for Factual Knowledge Extraction	Jun 20, 2024	MMLUQuestion Answering	—Unverified	0	0
Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size	Mar 6, 2025	MMLUQuantization	—Unverified	0	0
Unraveling Indirect In-Context Learning Using Influence Functions	Jan 1, 2025	In-Context LearningInformativeness	—Unverified	0	0
Evaluating Mathematical Reasoning Across Large Language Models: A Fine-Grained Approach	Mar 13, 2025	Formal LogicMathematical Reasoning	—Unverified	0	0
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs	Dec 17, 2024	MMLU	—Unverified	0	0
Upcycling Large Language Models into Mixture of Experts	Oct 10, 2024	Mixture-of-ExpertsMMLU	—Unverified	0	0
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content	Jun 25, 2025	ArticlesContinual Pretraining	—Unverified	0	0

Show:10 25 50

← PrevPage 23 of 34Next →

All datasets SIOP 2020/2021 MMLU-Pro VCTK

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	go ahead, make my data	Final_score	61.72	—	Unverified
2	#GreedyCow	Final_score	61.63	—	Unverified
3	Don't Ask Us y	Final_score	61.4	—	Unverified
4	Data_and_Confused	Final_score	60.96	—	Unverified
5	Waffles	Final_score	60.91	—	Unverified
6	raaka	Final_score	60.91	—	Unverified
7	Team Procrustination	Final_score	60.64	—	Unverified
8	Axiom Consulting Partners	Final_score	60.63	—	Unverified
9	Lets_Be_Fair	Final_score	60.23	—	Unverified
10	gooners	Final_score	60.22	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Orange-mini	0-shot MRR	99.19	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HybridBeam+	SI-SDRi	13.3	—	Unverified