| Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning | May 22, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation | May 16, 2025 | Diversity, RAG | —Unverified | 0 | 0 |
| Dynamic Token Reduction during Generation for Vision Language Models | Jan 24, 2025 | Decoder, Token Reduction | —Unverified | 0 | 0 |
| Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration | Nov 26, 2024 | Token Reduction | —Unverified | 0 | 0 |
| ZipR1: Reinforcing Token Sparsity in MLLMs | Apr 23, 2025 | Token Reduction | —Unverified | 0 | 0 |
| PAR: Prompt-Aware Token Reduction Method for Efficient Large Multimodal Models | Oct 9, 2024 | Question Answering, Retrieval | —Unverified | 0 | 0 |
| Selective Structured State-Spaces for Long-Form Video Understanding | Mar 25, 2023 | Contrastive Learning, Form | —Unverified | 0 | 0 |
| Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study | Mar 9, 2025 | Quantization, Token Reduction | —Unverified | 0 | 0 |
| Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction | Nov 30, 2024 | Bayesian Optimization, Token Reduction | —Unverified | 0 | 0 |
| STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference | May 18, 2025 | Token Reduction | —Unverified | 0 | 0 |
| DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs | Apr 23, 2025 | Token Reduction, Video Understanding | —Unverified | 0 | 0 |
| The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training | May 25, 2025 | Reinforcement Learning (RL), Token Reduction | —Unverified | 0 | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | May 20, 2025 | GSM8K, Mathematical Reasoning | —Unverified | 0 | 0 |
| Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models | Mar 21, 2025 | Computational Efficiency, Token Reduction | —Unverified | 0 | 0 |
| TPC-ViT: Token Propagation Controller for Efficient Vision Transformer | Jan 3, 2024 | Token Reduction | —Unverified | 0 | 0 |
| Deploying Foundation Model Powered Agent Services: A Survey | Dec 18, 2024 | model, Model Compression | —Unverified | 0 | 0 |
| Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Jun 6, 2025 | Depth Estimation, object-detection | —Unverified | 0 | 0 |
| TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer | Nov 19, 2022 | 3D geometry, Human Mesh Recovery | —Unverified | 0 | 0 |
| Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings | Jun 5, 2025 | Retrieval, Token Reduction | —Unverified | 0 | 0 |
| Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems | Oct 3, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | Dec 10, 2024 | General Knowledge, Text Generation | —Unverified | 0 | 0 |
| Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers | Mar 14, 2025 | GPU, Mamba | —Unverified | 0 | 0 |
| Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer | Aug 30, 2024 | Token Reduction | —Unverified | 0 | 0 |
| VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models | May 28, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Knowing When to Stop: Dynamic Context Cutoff for Large Language Models | Feb 3, 2025 | Token Reduction | —Unverified | 0 | 0 |
| AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration | Dec 16, 2024 | Denoising, Token Reduction | —Unverified | 0 | 0 |
| ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition | Dec 21, 2024 | Efficient ViTs, Token Reduction | —Unverified | 0 | 0 |
| Learning Free Token Reduction for Multi-Modal Large Language Models | Jan 29, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |