| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 |
| Efficient Long-Text Understanding with Short-Text Models | Aug 1, 2022 | Articles, Decoder | Code Available | 1 |
| Efficiently Modeling Long Sequences with Structured State Spaces | Oct 31, 2021 | Data Augmentation, Language Modeling | Code Available | 1 |
| Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator | May 24, 2023 | Abstractive Text Summarization, Document Summarization | Code Available | 1 |
| GIRAFFE: Design Choices for Extending the Context Length of Visual Language Models | Dec 17, 2024 | Long-range modeling | Code Available | 1 |
| Hierarchical Separable Video Transformer for Snapshot Compressive Imaging | Jul 16, 2024 | Inductive Bias, Long-range modeling | Code Available | 1 |
| Image Super-Resolution With Non-Local Sparse Attention | Jun 19, 2021 | Image Super-Resolution, Long-range modeling | Code Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPU, Long-range modeling | Code Available | 1 |
| KM-UNet KAN Mamba UNet for medical image segmentation | Jan 5, 2025 | Computational Efficiency, Image Segmentation | Code Available | 1 |
| Long Range Arena: A Benchmark for Efficient Transformers | Nov 8, 2020 | 16k, Benchmarking | Code Available | 1 |
| Long Range Propagation on Continuous-Time Dynamic Graphs | Jun 4, 2024 | Long-range modeling | Code Available | 1 |
| LongT5: Efficient Text-To-Text Transformer for Long Sequences | Dec 15, 2021 | Abstractive Text Summarization, Long-range modeling | Code Available | 1 |
| Multi-scale Attention Network for Single Image Super-Resolution | Sep 28, 2022 | Blocking, Image Super-Resolution | Code Available | 1 |
| Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention | Apr 22, 2022 | Long-range modeling | Code Available | 1 |
| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RL, Language Modelling | Code Available | 1 |
| QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Oct 9, 2024 | Image Classification | Code Available | 1 |
| Recurrent Distance Filtering for Graph Representation Learning | Dec 3, 2023 | Graph Classification, Graph Representation Learning | Code Available | 1 |
| SCROLLS: Standardized CompaRison Over Long Language Sequences | Jan 10, 2022 | Decoder, Long-range modeling | Code Available | 1 |
| Sparse Modular Activation for Efficient Sequence Modeling | Jun 19, 2023 | Chunking, Language Modeling | Code Available | 1 |
| Spatio-Spectral Graph Neural Networks | May 29, 2024 | GPU, Graph Classification | Code Available | 1 |
| T-former: An Efficient Transformer for Image Inpainting | May 12, 2023 | Image Inpainting, Long-range modeling | Code Available | 1 |
| The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks | Jun 14, 2023 | 16k, Classification | Code Available | 1 |
| U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration? | Aug 7, 2022 | Image Registration, Long-range modeling | Code Available | 1 |
| UL2: Unifying Language Learning Paradigms | May 10, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 1 |
| U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV | Jul 15, 2025 | Computational Efficiency, Image Segmentation | Code Available | 1 |