| FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance | Jan 5, 2025 | Token Reduction | Code Available | 1 | 5 |
| Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs | Sep 17, 2024 | Question Answering, Token Reduction | Code Available | 1 | 5 |
| Learning Compact Vision Tokens for Efficient Large Multimodal Models | Jun 8, 2025 | Multimodal Reasoning, Token Reduction | Code Available | 1 | 5 |
| Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters | Nov 5, 2024 | Token Reduction, Visual Reasoning | Code Available | 1 | 5 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Dec 14, 2021 | Efficient ViTs, image-classification | Code Available | 1 | 5 |
| Dynamic Compressing Prompts for Efficient Inference of Large Language Models | Apr 15, 2025 | Token Reduction | Code Available | 0 | 5 |
| Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration | May 16, 2025 | Denoising, Token Reduction | Code Available | 0 | 5 |
| BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression | Mar 4, 2025 | Large Language Model, Machine Translation | Code Available | 0 | 5 |
| Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model | Jan 1, 2025 | Denoising, Token Reduction | Code Available | 0 | 5 |
| Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition | Dec 31, 2024 | Fine-Grained Image Recognition, Token Reduction | Code Available | 0 | 5 |
| Faster Parameter-Efficient Tuning with Token Redundancy Reduction | Mar 26, 2025 | Token Reduction | Code Available | 0 | 5 |
| HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Jan 10, 2024 | Action Recognition, Action Recognition In Videos | Code Available | 0 | 5 |
| Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens | Mar 11, 2025 | Decoder, Image Generation | Code Available | 0 | 5 |
| Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers | Dec 13, 2024 | Token Reduction | Code Available | 0 | 5 |
| Not All Tokens Are What You Need In Thinking | May 23, 2025 | All, Token Reduction | Code Available | 0 | 5 |
| Rethinking Token Reduction with Parameter-Efficient Fine-Tuning in ViT for Pixel-Level Tasks | Jan 1, 2025 | Computational Efficiency, Diversity | Code Available | 0 | 5 |
| Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | Jun 5, 2025 | GPU, Text-to-Video Generation | Unverified | 0 | 0 |
| Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | May 12, 2025 | Semantic Compression, Semantic Similarity | Unverified | 0 | 0 |
| freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization, Question Answering | Unverified | 0 | 0 |
| Local Information Matters: Inference Acceleration For Grounded Conversation Generation Models Through Adaptive Local-Aware Token Pruning | Mar 31, 2025 | Semantic Segmentation, Token Reduction | Unverified | 0 | 0 |
| FIT-RAG: Black-Box RAG with Factual Information and Token Reduction | Mar 21, 2024 | Open-Domain Question Answering, Question Answering | Unverified | 0 | 0 |
| MINT: Mitigating Hallucinations in Large Vision-Language Models via Token Reduction | Feb 2, 2025 | Hallucination, Token Reduction | Unverified | 0 | 0 |
| AdaFV: Rethinking of Visual-Language alignment for VLM acceleration | Jan 16, 2025 | Token Reduction | Unverified | 0 | 0 |
| Efficient Multi-modal Large Language Models via Visual Token Grouping | Nov 26, 2024 | Image Captioning, Question Answering | Unverified | 0 | 0 |
| Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features | Apr 1, 2025 | Token Reduction | Unverified | 0 | 0 |