| Sparse Modular Activation for Efficient Sequence Modeling | Jun 19, 2023 | Chunking, Language Modeling | Code Available | 1 |
| The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks | Jun 14, 2023 | 16k, Classification | Code Available | 1 |
| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RL, Language Modelling | Code Available | 1 |
| Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator | May 24, 2023 | Abstractive Text Summarization, Document Summarization | Code Available | 1 |
| Focus Your Attention (with Adaptive IIR Filters) | May 24, 2023 | Language Modelling, Long-range modeling | Unverified | 0 |
| T-former: An Efficient Transformer for Image Inpainting | May 12, 2023 | Image Inpainting, Long-range modeling | Code Available | 1 |
| A General-Purpose Multilingual Document Encoder | May 11, 2023 | Cross-Lingual Transfer, Document Classification | Code Available | 0 |
| RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration | May 7, 2023 | Computational Efficiency, Image Registration | Code Available | 0 |
| HST-MRF: Heterogeneous Swin Transformer with Multi-Receptive Field for Medical Image Segmentation | Apr 10, 2023 | Image Segmentation, Lesion Segmentation | Unverified | 0 |
| CoLT5: Faster Long-Range Transformers with Conditional Computation | Mar 17, 2023 | Long-range modeling | Unverified | 0 |
| Hungry Hungry Hippos: Towards Language Modeling with State Space Models | Dec 28, 2022 | 8k, Coreference Resolution | Code Available | 2 |
| Token Transformer: Can class token help window-based transformer build better long-range interactions? | Nov 11, 2022 | image-classification, Image Classification | Unverified | 0 |
| What Makes Convolutional Models Great on Long Sequence Modeling? | Oct 17, 2022 | Long-range modeling | Code Available | 1 |
| CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling | Oct 14, 2022 | Benchmarking, Language Modeling | Code Available | 1 |
| Pose Guided Human Image Synthesis with Partially Decoupled GAN | Oct 7, 2022 | Decoder, Image Generation | Unverified | 0 |
| Multi-scale Attention Network for Single Image Super-Resolution | Sep 28, 2022 | Blocking, Image Super-Resolution | Code Available | 1 |
| Liquid Structural State-Space Models | Sep 26, 2022 | Heart rate estimation, Long-range modeling | Code Available | 2 |
| Mega: Moving Average Equipped Gated Attention | Sep 21, 2022 | Image Classification, Inductive Bias | Code Available | 2 |
| Adapting Pretrained Text-to-Text Models for Long Text Sequences | Sep 21, 2022 | Long-range modeling, Question Answering | Code Available | 1 |
| CNSNet: A Cleanness-Navigated-Shadow Network for Shadow Removal | Sep 6, 2022 | Long-range modeling, Shadow Removal | Code Available | 0 |
| Simplified State Space Layers for Sequence Modeling | Aug 9, 2022 | Computational Efficiency, ListOps | Code Available | 2 |
| Investigating Efficiently Extending Transformers for Long Input Summarization | Aug 8, 2022 | 16k, Long-range modeling | Code Available | 3 |
| U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration? | Aug 7, 2022 | Image Registration, Long-range modeling | Code Available | 1 |
| Efficient Long-Text Understanding with Short-Text Models | Aug 1, 2022 | Articles, Decoder | Code Available | 1 |
| Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration | Jul 21, 2022 | Long-range modeling, Object | Code Available | 1 |
| How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections | Jun 24, 2022 | Long-range modeling, State Space Models | Code Available | 0 |
| On the Parameterization and Initialization of Diagonal State Space Models | Jun 23, 2022 | Long-range modeling, State Space Models | Code Available | 0 |
| 0/1 Deep Neural Networks via Block Coordinate Descent | Jun 19, 2022 | 10-shot image generation | Unverified | 0 |
| ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths | Jun 12, 2022 | Chunking, Document Classification | Code Available | 1 |
| UL2: Unifying Language Learning Paradigms | May 10, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 1 |
| Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention | Apr 22, 2022 | Long-range modeling | Code Available | 1 |
| Diagonal State Spaces are as Effective as Structured State Spaces | Mar 27, 2022 | Long-range modeling | Code Available | 0 |
| SCROLLS: Standardized CompaRison Over Long Language Sequences | Jan 10, 2022 | Decoder, Long-range modeling | Code Available | 1 |
| Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks | Jan 6, 2022 | Audio Classification, Classification | Code Available | 1 |
| LongT5: Efficient Text-To-Text Transformer for Long Sequences | Dec 15, 2021 | Abstractive Text Summarization, Long-range modeling | Code Available | 1 |
| Efficiently Modeling Long Sequences with Structured State Spaces | Oct 31, 2021 | Data Augmentation, Language Modeling | Code Available | 1 |
| Dyadformer: A Multi-modal Transformer for Long-Range Modeling of Dyadic Interactions | Sep 20, 2021 | Long-range modeling | Unverified | 0 |
| Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy | Sep 17, 2021 | Code Completion, Code Generation | Unverified | 0 |
| Sparse Factorization of Large Square Matrices | Sep 16, 2021 | Long-range modeling | Code Available | 0 |
| Image Super-Resolution With Non-Local Sparse Attention | Jun 19, 2021 | Image Super-Resolution, Long-range modeling | Code Available | 1 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 |
| Gated Relational Graph Attention Networks | Jan 1, 2021 | Graph Attention, Long-range modeling | Unverified | 0 |
| Long Range Arena: A Benchmark for Efficient Transformers | Nov 8, 2020 | 16k, Benchmarking | Code Available | 1 |
| Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition | Mar 31, 2020 | 3D Action Recognition, Action Recognition | Code Available | 1 |
| V4D:4D Convolutional Neural Networks for Video-level Representation Learning | Feb 18, 2020 | Long-range modeling, Representation Learning | Code Available | 1 |