SOTAVerified

Token Reduction

Papers

Showing 51–75 of 78 papers

| Title | Status | Hype |
|---|---|---|
| Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning | | 0 |
| The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training | | 0 |
| Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models | | 0 |
| TPC-ViT: Token Propagation Controller for Efficient Vision Transformer | | 0 |
| Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | | 0 |
| TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer | | 0 |
| Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings | | 0 |
| TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | | 0 |
| Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers | | 0 |
| Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer | | 0 |
| VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models | | 0 |
| Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction | | 0 |
| Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration | | 0 |
| PAR: Prompt-Aware Token Reduction Method for Efficient Large Multimodal Models | | 0 |
| Selective Structured State-Spaces for Long-Form Video Understanding | | 0 |
| Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study | | 0 |
| STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference | | 0 |
| Not All Tokens Are What You Need In Thinking | Code | 0 |
| Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers | Code | 0 |
| Faster Parameter-Efficient Tuning with Token Redundancy Reduction | Code | 0 |
| Rethinking Token Reduction with Parameter-Efficient Fine-Tuning in ViT for Pixel-Level Tasks | Code | 0 |
| BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression | Code | 0 |
| Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition | Code | 0 |
| Dynamic Compressing Prompts for Efficient Inference of Large Language Models | Code | 0 |
| Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens | Code | 0 |

No leaderboard results yet.