SOTAVerified

Token Reduction

Papers

Showing 1–25 of 78 papers

Title | Status | Hype
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality | Code | 3
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | Code | 3
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | Code | 2
PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models | Code | 2
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning | Code | 2
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models | Code | 2
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Code | 2
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Code | 2
Learning Compact Vision Tokens for Efficient Large Multimodal Models | Code | 1
SiLVR: A Simple Language-based Video Reasoning Framework | Code | 1
FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models | Code | 1
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | Code | 1
Streamline Without Sacrifice -- Squeeze out Computation Redundancy in LMM | Code | 1
Window Token Concatenation for Efficient Visual Large Language Models | Code | 1
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance | Code | 1
Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training | Code | 1
Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Code | 1
Token Cropr: Faster ViTs for Quite a Few Tasks | Code | 1
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters | Code | 1
Rethinking Token Reduction for State Space Models | Code | 1
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model | Code | 1
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs | Code | 1
Bridging Local Details and Global Context in Text-Attributed Graphs | Code | 1
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Code | 1
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs | Code | 1
Page 1 of 4

No leaderboard results yet.