| Inconsistencies in Masked Language Models | Dec 30, 2022 | LAMBADA, MMLU | Code Available | 0 |
| LAMBADA: Backward Chaining for Automated Reasoning in Natural Language | Dec 20, 2022 | LAMBADA, Logical Reasoning | Unverified | 0 |
| Leveraging Relaxed Equilibrium by Lazy Transition for Sequence Modeling | May 1, 2022 | LAMBADA, Learning to Execute | Unverified | 0 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 |
| CoreLM: Coreference-aware Language Model Fine-Tuning | Nov 4, 2021 | LAMBADA, Language Modeling | Unverified | 0 |
| The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models | Aug 13, 2021 | LAMBADA, Text Generation | Code Available | 0 |
| E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks | Nov 10, 2020 | LAMBADA, Language Modeling | Unverified | 0 |
| Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences | Apr 6, 2020 | LAMBADA, Language Modelling | Code Available | 1 |
| Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time | Dec 1, 2019 | LAMBADA, Language Modeling | Code Available | 0 |