| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Dec 1, 2023 | 2D Pose Estimation, Common Sense Reasoning | Code Available | 6 |
| MedMamba: Vision Mamba for Medical Image Classification | Mar 6, 2024 | Classification, image-classification | Code Available | 4 |
| MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection | Apr 9, 2024 | Anomaly Detection, Decoder | Code Available | 3 |
| Investigating Efficiently Extending Transformers for Long Input Summarization | Aug 8, 2022 | 16k, Long-range modeling | Code Available | 3 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Jul 25, 2024 | 3D Object Detection, Long-range modeling | Code Available | 3 |
| MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration | Jan 25, 2024 | Computed Tomography (CT), Image Registration | Code Available | 2 |
| Emulating Self-attention with Convolution for Efficient Image Super-Resolution | Mar 9, 2025 | Computational Efficiency, Image Super-Resolution | Code Available | 2 |
| Hungry Hungry Hippos: Towards Language Modeling with State Space Models | Dec 28, 2022 | 8k, Coreference Resolution | Code Available | 2 |
| Simplified State Space Layers for Sequence Modeling | Aug 9, 2022 | Computational Efficiency, ListOps | Code Available | 2 |
| MambaFusion: Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection | Jul 6, 2025 | 3D Object Detection, Attribute | Code Available | 2 |
| PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model | Aug 7, 2024 | 3D Human Pose Estimation, Long-range modeling | Code Available | 2 |
| nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model | Feb 5, 2024 | 3D Medical Imaging Segmentation, Image Segmentation | Code Available | 2 |
| Liquid Structural State-Space Models | Sep 26, 2022 | Heart rate estimation, Long-range modeling | Code Available | 2 |
| TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Jul 28, 2023 | Long-range modeling, Mixture-of-Experts | Code Available | 2 |
| LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation | Mar 12, 2024 | Image Segmentation, Long-range modeling | Code Available | 2 |
| Mega: Moving Average Equipped Gated Attention | Sep 21, 2022 | Image Classification, Inductive Bias | Code Available | 2 |
| DVMSR: Distillated Vision Mamba for Efficient Super-Resolution | May 5, 2024 | Image Super-Resolution, Long-range modeling | Code Available | 2 |
| MambaVC: Learned Visual Compression with Selective State Spaces | May 24, 2024 | Long-range modeling, State Space Models | Code Available | 2 |
| CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling | Oct 14, 2022 | Benchmarking, Language Modeling | Code Available | 1 |
| Adapting Pretrained Text-to-Text Models for Long Text Sequences | Sep 21, 2022 | Long-range modeling, Question Answering | Code Available | 1 |
| A Simple LLM Framework for Long-Range Video Question-Answering | Dec 28, 2023 | EgoSchema, Language Modelling | Code Available | 1 |
| ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths | Jun 12, 2022 | Chunking, Document Classification | Code Available | 1 |
| Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks | Jan 6, 2022 | Audio Classification, Classification | Code Available | 1 |
| CT-Mamba: A Hybrid Convolutional State Space Model for Low-Dose CT Denoising | Nov 12, 2024 | Denoising, Diagnostic | Code Available | 1 |
| Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition | Mar 31, 2020 | 3D Action Recognition, Action Recognition | Code Available | 1 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | May 25, 2021 | Action Recognition, Long-range modeling | Code Available | 1 |
| Efficient Long-Text Understanding with Short-Text Models | Aug 1, 2022 | Articles, Decoder | Code Available | 1 |
| Efficiently Modeling Long Sequences with Structured State Spaces | Oct 31, 2021 | Data Augmentation, Language Modeling | Code Available | 1 |
| Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator | May 24, 2023 | Abstractive Text Summarization, Document Summarization | Code Available | 1 |
| GIRAFFE: Design Choices for Extending the Context Length of Visual Language Models | Dec 17, 2024 | Long-range modeling | Code Available | 1 |
| Hierarchical Separable Video Transformer for Snapshot Compressive Imaging | Jul 16, 2024 | Inductive Bias, Long-range modeling | Code Available | 1 |
| Image Super-Resolution With Non-Local Sparse Attention | Jun 19, 2021 | Image Super-Resolution, Long-range modeling | Code Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPU, Long-range modeling | Code Available | 1 |
| KM-UNet KAN Mamba UNet for medical image segmentation | Jan 5, 2025 | Computational Efficiency, Image Segmentation | Code Available | 1 |
| Long Range Arena: A Benchmark for Efficient Transformers | Nov 8, 2020 | 16k, Benchmarking | Code Available | 1 |
| Long Range Propagation on Continuous-Time Dynamic Graphs | Jun 4, 2024 | Long-range modeling | Code Available | 1 |
| LongT5: Efficient Text-To-Text Transformer for Long Sequences | Dec 15, 2021 | Abstractive Text Summarization, Long-range modeling | Code Available | 1 |
| Multi-scale Attention Network for Single Image Super-Resolution | Sep 28, 2022 | Blocking, Image Super-Resolution | Code Available | 1 |
| Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention | Apr 22, 2022 | Long-range modeling | Code Available | 1 |
| Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | May 31, 2023 | D4RL, Language Modelling | Code Available | 1 |
| QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Oct 9, 2024 | image-classification, Image Classification | Code Available | 1 |
| Recurrent Distance Filtering for Graph Representation Learning | Dec 3, 2023 | Graph Classification, Graph Representation Learning | Code Available | 1 |
| SCROLLS: Standardized CompaRison Over Long Language Sequences | Jan 10, 2022 | Decoder, Long-range modeling | Code Available | 1 |
| Sparse Modular Activation for Efficient Sequence Modeling | Jun 19, 2023 | Chunking, Language Modeling | Code Available | 1 |
| Spatio-Spectral Graph Neural Networks | May 29, 2024 | GPU, Graph Classification | Code Available | 1 |
| T-former: An Efficient Transformer for Image Inpainting | May 12, 2023 | Image Inpainting, Long-range modeling | Code Available | 1 |
| The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks | Jun 14, 2023 | 16k, Classification | Code Available | 1 |
| U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration? | Aug 7, 2022 | Image Registration, Long-range modeling | Code Available | 1 |
| UL2: Unifying Language Learning Paradigms | May 10, 2022 | Arithmetic Reasoning, Common Sense Reasoning | Code Available | 1 |
| U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV | Jul 15, 2025 | Computational Efficiency, Image Segmentation | Code Available | 1 |