SOTAVerified

Token Reduction

Papers

Showing 26-50 of 78 papers

| Title | Status | Hype |
|---|---|---|
| Local Information Matters: Inference Acceleration For Grounded Conversation Generation Models Through Adaptive Local-Aware Token Pruning | | 0 |
| Faster Parameter-Efficient Tuning with Token Redundancy Reduction | Code | 0 |
| Token Dynamics: Towards Efficient and Dynamic Video Token Representation for Video Large Language Models | | 0 |
| Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers | | 0 |
| Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens | Code | 0 |
| When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning | Code | 2 |
| Does Acceleration Cause Hidden Instability in Vision Language Models? Uncovering Instance-Level Divergence Through a Large-Scale Empirical Study | | 0 |
| BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression | Code | 0 |
| Knowing When to Stop: Dynamic Context Cutoff for Large Language Models | | 0 |
| MINT: Mitigating Hallucinations in Large Vision-Language Models via Token Reduction | | 0 |
| Learning Free Token Reduction for Multi-Modal Large Language Models | | 0 |
| Dynamic Token Reduction during Generation for Vision Language Models | | 0 |
| AdaFV: Rethinking of Visual-Language alignment for VLM acceleration | | 0 |
| FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance | Code | 1 |
| Rethinking Token Reduction with Parameter-Efficient Fine-Tuning in ViT for Pixel-Level Tasks | Code | 0 |
| Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model | Code | 0 |
| Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition | Code | 0 |
| FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models | Code | 2 |
| ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition | | 0 |
| Deploying Foundation Model Powered Agent Services: A Survey | | 0 |
| Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training | Code | 1 |
| AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration | | 0 |
| Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers | Code | 0 |
| Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Code | 1 |
| TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | | 0 |

No leaderboard results yet.