SOTAVerified

MMLU

Papers

Showing 101-110 of 340 papers

| Title | Status | Hype |
|---|---|---|
| QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | Code | 0 |
| Probing then Editing Response Personality of Large Language Models | Code | 0 |
| Post-Hoc Reversal: Are We Selecting Models Prematurely? | Code | 0 |
| OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms | Code | 0 |
| ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study | Code | 0 |
| Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy | Code | 0 |
| Evaluation of Large Language Models via Coupled Token Generation | Code | 0 |
| Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models | Code | 0 |
| Inconsistencies in Masked Language Models | Code | 0 |
| CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | go ahead, make my data | Final_score | 61.72 | | Unverified |
| 2 | #GreedyCow | Final_score | 61.63 | | Unverified |
| 3 | Don't Ask Us y | Final_score | 61.4 | | Unverified |
| 4 | Data_and_Confused | Final_score | 60.96 | | Unverified |
| 5 | Waffles | Final_score | 60.91 | | Unverified |
| 6 | raaka | Final_score | 60.91 | | Unverified |
| 7 | Team Procrustination | Final_score | 60.64 | | Unverified |
| 8 | Axiom Consulting Partners | Final_score | 60.63 | | Unverified |
| 9 | Lets_Be_Fair | Final_score | 60.23 | | Unverified |
| 10 | gooners | Final_score | 60.22 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Orange-mini | 0-shot MRR | 99.19 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HybridBeam+ | SI-SDRi | 13.3 | | Unverified |