| Which Tokens to Use? Investigating Token Reduction in Vision Transformers | Aug 9, 2023 | Classification, image-classification | Code Available | 1 |
| Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Jun 3, 2023 | Computational Efficiency, image-classification | Code Available | 1 |
| PuMer: Pruning and Merging Tokens for Efficient Vision Language Models | May 27, 2023 | Token Reduction | Code Available | 1 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Dec 14, 2021 | Efficient ViTs, image-classification | Code Available | 1 |
| TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference | May 25, 2021 | Token Reduction | Code Available | 1 |
| Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Jun 6, 2025 | Depth Estimation, object-detection | Unverified | 0 |
| Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings | Jun 5, 2025 | Retrieval, Token Reduction | Unverified | 0 |
| Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | Jun 5, 2025 | GPU, Text-to-Video Generation | Unverified | 0 |
| VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models | May 28, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training | May 25, 2025 | Reinforcement Learning (RL), Token Reduction | Unverified | 0 |
| Not All Tokens Are What You Need In Thinking | May 23, 2025 | All, Token Reduction | Code Available | 0 |
| Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning | May 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | May 20, 2025 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference | May 18, 2025 | Token Reduction | Unverified | 0 |
| Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration | May 16, 2025 | Denoising, Token Reduction | Code Available | 0 |
| EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation | May 16, 2025 | Diversity, RAG | Unverified | 0 |
| Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | May 12, 2025 | Semantic Compression, Semantic Similarity | Unverified | 0 |
| ZipR1: Reinforcing Token Sparsity in MLLMs | Apr 23, 2025 | Token Reduction | Unverified | 0 |
| DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs | Apr 23, 2025 | Token Reduction, Video Understanding | Unverified | 0 |
| Dynamic Compressing Prompts for Efficient Inference of Large Language Models | Apr 15, 2025 | Token Reduction | Code Available | 0 |
| Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features | Apr 1, 2025 | Token Reduction | Unverified | 0 |
| Local Information Matters: Inference Acceleration For Grounded Conversation Generation Models Through Adaptive Local-Aware Token Pruning | Mar 31, 2025 | Semantic Segmentation, Token Reduction | Unverified | 0 |
| Faster Parameter-Efficient Tuning with Token Redundancy Reduction | Mar 26, 2025 | Token Reduction | Code Available | 0 |
| Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models | Mar 21, 2025 | Computational Efficiency, Token Reduction | Unverified | 0 |
| Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers | Mar 14, 2025 | GPU, Mamba | Unverified | 0 |