| Token Cropr: Faster ViTs for Quite a Few Tasks | Dec 1, 2024 | image-classification, Image Classification | Code Available | 1 |
| Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction | Nov 30, 2024 | Bayesian Optimization, Token Reduction | Unverified | 0 |
| Efficient Multi-modal Large Language Models via Visual Token Grouping | Nov 26, 2024 | Image Captioning, Question Answering | Unverified | 0 |
| Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration | Nov 26, 2024 | Token Reduction | Unverified | 0 |
| freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization, Question Answering | Unverified | 0 |
| Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters | Nov 5, 2024 | Token Reduction, Visual Reasoning | Code Available | 1 |
| LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | Oct 22, 2024 | Token Reduction, Video Question Answering | Code Available | 3 |
| Rethinking Token Reduction for State Space Models | Oct 16, 2024 | Mamba, State Space Models | Code Available | 1 |
| PAR: Prompt-Aware Token Reduction Method for Efficient Large Multimodal Models | Oct 9, 2024 | Question Answering, Retrieval | Unverified | 0 |
| FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model | Oct 3, 2024 | Emotion Recognition, Language Modeling | Code Available | 1 |
| Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems | Oct 3, 2024 | Language Modelling, Large Language Model | Unverified | 0 |
| Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Sep 25, 2024 | GPU, Token Reduction | Code Available | 2 |
| Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs | Sep 17, 2024 | Question Answering, Token Reduction | Code Available | 1 |
| Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer | Aug 30, 2024 | Token Reduction | Unverified | 0 |
| Bridging Local Details and Global Context in Text-Attributed Graphs | Jun 18, 2024 | Representation Learning, Token Reduction | Code Available | 1 |
| ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Jun 14, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs | Apr 16, 2024 | Long-Context Understanding, Token Reduction | Code Available | 1 |
| LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Mar 22, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| FIT-RAG: Black-Box RAG with Factual Information and Token Reduction | Mar 21, 2024 | Open-Domain Question Answering, Question Answering | Unverified | 0 |
| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 |
| TPC-ViT: Token Propagation Controller for Efficient Vision Transformer | Jan 3, 2024 | Token Reduction | Unverified | 0 |
| Which Tokens to Use? Investigating Token Reduction in Vision Transformers | Aug 9, 2023 | Classification, image-classification | Code Available | 1 |
| Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Jun 3, 2023 | Computational Efficiency, image-classification | Code Available | 1 |
| PuMer: Pruning and Merging Tokens for Efficient Vision Language Models | May 27, 2023 | Token Reduction | Code Available | 1 |
| Selective Structured State-Spaces for Long-Form Video Understanding | Mar 25, 2023 | Contrastive Learning, Form | Unverified | 0 |