SOTAVerified

LAMBADA

Papers

Showing 1–30 of 30 papers

Title | Status | Hype
Training Compute-Optimal Large Language Models | Code | 6
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | Code | 2
Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM | Code | 1
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | Code | 1
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences | Code | 1
The LAMBADA dataset: Word prediction requiring a broad discourse context | Code | 1
Matryoshka Model Learning for Improved Elastic Student Models | | 0
AdaGC: Improving Training Stability for Large Language Model Pretraining | | 0
SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning | | 0
PIXAR: Auto-Regressive Language Modeling in Pixel Space | | 0
Concise and Organized Perception Facilitates Reasoning in Large Language Models | | 0
Headless Language Models: Learning without Predicting with Contrastive Weight Tying | | 0
Stay on topic with Classifier-Free Guidance | | 0
Inconsistencies in Masked Language Models | Code | 0
LAMBADA: Backward Chaining for Automated Reasoning in Natural Language | | 0
Leveraging Relaxed Equilibrium by Lazy Transition for Sequence Modeling | | 0
CoreLM: Coreference-aware Language Model Fine-Tuning | | 0
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models | Code | 0
E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks | | 0
Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time | Code | 0
Attending to Entities for Better Text Understanding | | 0
Not Enough Data? Deep Learning to the Rescue! | Code | 0
Entity Tracking Improves Cloze-style Reading Comprehension | Code | 0
Universal Transformers | Code | 0
Neural Models for Reasoning over Multiple Mentions using Coreference | | 0
Linguistic Knowledge as Memory for Recurrent Neural Networks | | 0
Broad Context Language Modeling as Reading Comprehension | | 0

No leaderboard results yet.