| PIXAR: Auto-Regressive Language Modeling in Pixel Space | Jan 6, 2024 | Decoder, LAMBADA | Unverified | 0 |
| Stay on topic with Classifier-Free Guidance | Jun 30, 2023 | Code Generation, Common Sense Reasoning | Unverified | 0 |
| SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning | Feb 20, 2024 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time | Dec 1, 2019 | LAMBADA, Language Modeling | Code Available | 0 |
| Neural Shuffle-Exchange Networks -- Sequence Processing in O(n log n) Time | Jul 18, 2019 | LAMBADA, Language Modeling | Code Available | 0 |
| The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models | Aug 13, 2021 | LAMBADA, Text Generation | Code Available | 0 |
| Universal Transformers | Jul 10, 2018 | Inductive Bias, LAMBADA | Code Available | 0 |
| Inconsistencies in Masked Language Models | Dec 30, 2022 | LAMBADA, MMLU | Code Available | 0 |
| Entity Tracking Improves Cloze-style Reading Comprehension | Oct 5, 2018 | LAMBADA, Reading Comprehension | Code Available | 0 |
| Not Enough Data? Deep Learning to the Rescue! | Nov 8, 2019 | Data Augmentation, Deep Learning | Code Available | 0 |