| Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Dec 13, 2024 | Token Reduction | Code Available | 1 |
| CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | May 22, 2025 | Token Reduction | Code Available | 1 |
| Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Jun 3, 2023 | Computational Efficiency, image-classification | Code Available | 1 |
| Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs | Apr 16, 2024 | Long-Context Understanding, Token Reduction | Code Available | 1 |
| Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters | Nov 5, 2024 | Token Reduction, Visual Reasoning | Code Available | 1 |
| Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training | Dec 17, 2024 | Mamba, Token Reduction | Code Available | 1 |
| Bridging Local Details and Global Context in Text-Attributed Graphs | Jun 18, 2024 | Representation Learning, Token Reduction | Code Available | 1 |
| FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models | May 26, 2025 | Token Reduction | Code Available | 1 |
| ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Jun 14, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model | Oct 3, 2024 | Emotion Recognition, Language Modeling | Code Available | 1 |