| LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | Oct 22, 2024 | Token Reduction, Video Question Answering | Code Available | 3 | 5 |
| Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality | May 23, 2025 | In-Context Learning, Token Reduction | Code Available | 3 | 5 |
| When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning | Mar 10, 2025 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | May 29, 2025 | Contrastive Learning, Text Retrieval | Code Available | 2 | 5 |
| PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models | Apr 11, 2025 | Clustering, Language Modeling | Code Available | 2 | 5 |
| LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Mar 22, 2024 | Language Modelling, Large Language Model | Code Available | 2 | 5 |
| Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Sep 25, 2024 | GPU, Token Reduction | Code Available | 2 | 5 |
| FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models | Dec 30, 2024 | Question Answering, Token Reduction | Code Available | 2 | 5 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | Math, MME | Code Available | 1 | 5 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Dec 14, 2021 | Efficient ViTs, image-classification | Code Available | 1 | 5 |
| PuMer: Pruning and Merging Tokens for Efficient Vision Language Models | May 27, 2023 | Token Reduction | Code Available | 1 | 5 |
| CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | May 22, 2025 | Token Reduction | Code Available | 1 | 5 |
| Learning Compact Vision Tokens for Efficient Large Multimodal Models | Jun 8, 2025 | Multimodal Reasoning, Token Reduction | Code Available | 1 | 5 |
| Rethinking Token Reduction for State Space Models | Oct 16, 2024 | Mamba, State Space Models | Code Available | 1 | 5 |
| Streamline Without Sacrifice -- Squeeze out Computation Redundancy in LMM | May 21, 2025 | Decoder, Token Reduction | Code Available | 1 | 5 |
| Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Jun 3, 2023 | Computational Efficiency, image-classification | Code Available | 1 | 5 |
| FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance | Jan 5, 2025 | Token Reduction | Code Available | 1 | 5 |
| Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs | Apr 16, 2024 | Long-Context Understanding, Token Reduction | Code Available | 1 | 5 |
| Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters | Nov 5, 2024 | Token Reduction, Visual Reasoning | Code Available | 1 | 5 |
| FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models | May 26, 2025 | Token Reduction | Code Available | 1 | 5 |
| Bridging Local Details and Global Context in Text-Attributed Graphs | Jun 18, 2024 | Representation Learning, Token Reduction | Code Available | 1 | 5 |
| Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs | Sep 17, 2024 | Question Answering, Token Reduction | Code Available | 1 | 5 |
| Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Dec 13, 2024 | Token Reduction | Code Available | 1 | 5 |
| FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model | Oct 3, 2024 | Emotion Recognition, Language Modeling | Code Available | 1 | 5 |
| ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Jun 14, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 | 5 |