SOTAVerified

Token Reduction

Papers

Showing 26–50 of 78 papers

| Title | Status | Hype |
| --- | --- | --- |
| Which Tokens to Use? Investigating Token Reduction in Vision Transformers | Code | 1 |
| Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Code | 1 |
| PuMer: Pruning and Merging Tokens for Efficient Vision Language Models | Code | 1 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Code | 1 |
| TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference | Code | 1 |
| Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | — | 0 |
| Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings | — | 0 |
| Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | — | 0 |
| VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models | — | 0 |
| The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training | — | 0 |
| Not All Tokens Are What You Need In Thinking | Code | 0 |
| Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning | — | 0 |
| DRP: Distilled Reasoning Pruning with Skill-aware Step Decomposition for Efficient Large Reasoning Models | — | 0 |
| STAR: Stage-Wise Attention-Guided Token Reduction for Efficient Large Vision-Language Models Inference | — | 0 |
| Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration | Code | 0 |
| EcoSafeRAG: Efficient Security through Context Analysis in Retrieval-Augmented Generation | — | 0 |
| Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | — | 0 |
| ZipR1: Reinforcing Token Sparsity in MLLMs | — | 0 |
| DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs | — | 0 |
| Dynamic Compressing Prompts for Efficient Inference of Large Language Models | Code | 0 |
| Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features | — | 0 |
| Local Information Matters: Inference Acceleration For Grounded Conversation Generation Models Through Adaptive Local-Aware Token Pruning | — | 0 |
| Faster Parameter-Efficient Tuning with Token Redundancy Reduction | Code | 0 |
| Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models | — | 0 |
| Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers | — | 0 |
Page 2 of 4

No leaderboard results yet.