| DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation | Nov 6, 2024 | Image Generation, Inductive Bias | Code Available | 1 | 5 |
| EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention | Oct 10, 2023 | Computational Efficiency, image-classification | Code Available | 1 | 5 |
| DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets | Apr 3, 2024 | Image Classification, Inductive Bias | Code Available | 1 | 5 |
| Emergent Representations of Program Semantics in Language Models Trained on Programs | May 18, 2023 | Inductive Bias, Language Modelling | Code Available | 1 | 5 |
| Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks | Jun 16, 2020 | Graph Neural Network, Human-Object Interaction Detection | Code Available | 1 | 5 |
| Examining the Inductive Bias of Neural Language Models with Artificial Languages | Jun 2, 2021 | Inductive Bias | Code Available | 1 | 5 |
| Exact Hard Monotonic Attention for Character-Level Transduction | May 15, 2019 | Hard Attention, Inductive Bias | Code Available | 1 | 5 |
| Ewald-based Long-Range Message Passing for Molecular Graphs | Mar 8, 2023 | Inductive Bias | Code Available | 1 | 5 |
| Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models | Jul 24, 2024 | ARC, Inductive Bias | Code Available | 1 | 5 |
| Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth | Mar 5, 2021 | All, Inductive Bias | Code Available | 1 | 5 |