SOTAVerified

LAMBADA

Papers

Showing 1–10 of 30 papers

Title | Status | Hype
Training Compute-Optimal Large Language Models | Code | 6
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Code | 2
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Code | 1
Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM | Code | 1
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences | Code | 1
The LAMBADA dataset: Word prediction requiring a broad discourse context | Code | 1
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models | Code | 0
Inconsistencies in Masked Language Models | Code | 0
Page 1 of 3

No leaderboard results yet.