| Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition | Dec 31, 2024 | Fine-Grained Image Recognition, Token Reduction | Code Available | 0 |
| ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition | Dec 21, 2024 | Efficient ViTs, Token Reduction | Unverified | 0 |
| Deploying Foundation Model Powered Agent Services: A Survey | Dec 18, 2024 | model, Model Compression | Unverified | 0 |
| AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration | Dec 16, 2024 | Denoising, Token Reduction | Unverified | 0 |
| Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers | Dec 13, 2024 | Token Reduction | Code Available | 0 |
| TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | Dec 10, 2024 | General Knowledge, Text Generation | Unverified | 0 |
| Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction | Nov 30, 2024 | Bayesian Optimization, Token Reduction | Unverified | 0 |
| Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration | Nov 26, 2024 | Token Reduction | Unverified | 0 |
| Efficient Multi-modal Large Language Models via Visual Token Grouping | Nov 26, 2024 | Image Captioning, Question Answering | Unverified | 0 |
| freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization, Question Answering | Unverified | 0 |