SOTAVerified

MMLU

Papers

Showing 81-90 of 340 papers

| Title | Status | Hype |
|---|---|---|
| Unfamiliar Finetuning Examples Control How Language Models Hallucinate | Code | 1 |
| To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering | Code | 1 |
| Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers | Code | 1 |
| Gemini: A Family of Highly Capable Multimodal Models | Code | 1 |
| Prompt Optimization via Adversarial In-Context Learning | Code | 1 |
| Efficient Online Data Mixing For Language Model Pre-Training | Code | 1 |
| ArcMMLU: A Library and Information Science Benchmark for Large Language Models | Code | 1 |
| ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization | Code | 1 |
| An Open Source Data Contamination Report for Large Language Models | Code | 1 |
| Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models | Code | 1 |
Page 9 of 34

Benchmark Results

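| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | go ahead, make my data | Final_score | 61.72 | | Unverified |
| 2 | #GreedyCow | Final_score | 61.63 | | Unverified |
| 3 | Don't Ask Us y | Final_score | 61.4 | | Unverified |
| 4 | Data_and_Confused | Final_score | 60.96 | | Unverified |
| 5 | Waffles | Final_score | 60.91 | | Unverified |
| 6 | raaka | Final_score | 60.91 | | Unverified |
| 7 | Team Procrustination | Final_score | 60.64 | | Unverified |
| 8 | Axiom Consulting Partners | Final_score | 60.63 | | Unverified |
| 9 | Lets_Be_Fair | Final_score | 60.23 | | Unverified |
| 10 | gooners | Final_score | 60.22 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orange-mini | 0-shot MRR | 99.19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HybridBeam+ | SI-SDRi | 13.3 | | Unverified |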