
Efficient ViTs

Increasing the efficiency of ViTs without modifying the architecture (e.g., key & query sparsification, token pruning & merging).
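To make the core idea concrete, below is a minimal sketch of ToMe-style token merging (after "Token Merging: Your ViT But Faster" in the list that follows). It is not the authors' implementation: tokens are split into two sets by alternation, each token in one set is matched to its most similar token in the other via cosine similarity, and the r best-matched pairs are averaged. The function name `merge_tokens` and all variable names are ours.

```python
import torch
import torch.nn.functional as F

def merge_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
    """Reduce a (batch, tokens, dim) sequence by merging r token pairs."""
    a, b = x[:, ::2, :], x[:, 1::2, :]          # alternate tokens into sets A and B
    # Cosine similarity between every A token and every B token.
    sim = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).transpose(-1, -2)
    best_val, best_idx = sim.max(dim=-1)        # each A token's closest B token
    r = min(r, a.shape[1])
    order = best_val.argsort(dim=-1, descending=True)
    src_idx = order[:, :r]                      # the r most redundant A tokens: merge away
    keep_idx = order[:, r:]                     # remaining A tokens: keep unchanged
    dst_idx = best_idx.gather(1, src_idx)       # their destinations in B

    d = x.shape[-1]
    src = a.gather(1, src_idx.unsqueeze(-1).expand(-1, -1, d))
    # Fold each merged A token into its matched B token by averaging.
    b = b.scatter_reduce(1, dst_idx.unsqueeze(-1).expand(-1, -1, d),
                         src, reduce="mean", include_self=True)
    kept = a.gather(1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    return torch.cat([kept, b], dim=1)          # (batch, tokens - r, dim)
```

For a ViT-B input of shape (1, 197, 768), `merge_tokens(x, r=16)` returns (1, 181, 768), so applying it between blocks shrinks the sequence layer by layer. A faithful implementation would additionally protect the [CLS] token, measure similarity on attention keys rather than raw features, and track how many patches each merged token represents for proportionally weighted attention.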

Papers

Showing 1-32 of 32 papers

| Title | Status | Hype |
| --- | --- | --- |
| Token Merging: Your ViT But Faster | Code | 3 |
| Fast Vision Transformers with HiLo Attention | Code | 2 |
| Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers | Code | 1 |
| GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation | Code | 1 |
| PPT: Token Pruning and Pooling for Efficient Vision Transformers | Code | 1 |
| Learned Thresholds Token Merging and Pruning for Vision Transformers | Code | 1 |
| MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets | Code | 1 |
| ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer | Code | 1 |
| DiffRate : Differentiable Compression Rate for Efficient Vision Transformers | Code | 1 |
| Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers | Code | 1 |
| Making Vision Transformers Efficient from A Token Sparsification View | Code | 1 |
| Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference | Code | 1 |
| Adaptive Sparse ViT: Towards Learnable Adaptive Token Pruning by Fully Exploiting Self-Attention | Code | 1 |
| Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations | Code | 1 |
| SPViT: Enabling Faster Vision Transformers via Soft Token Pruning | Code | 1 |
| AdaViT: Adaptive Tokens for Efficient Vision Transformer | Code | 1 |
| Adaptive Token Sampling For Efficient Vision Transformers | Code | 1 |
| Pruning Self-attentions into Convolutional Layers in Single Path | Code | 1 |
| Global Vision Transformer Pruning with Hessian-Aware Saliency | Code | 1 |
| Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer | Code | 1 |
| Chasing Sparsity in Vision Transformers: An End-to-End Exploration | Code | 1 |
| DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Code | 1 |
| All Tokens Matter: Token Labeling for Training Better Vision Transformers | Code | 1 |
| Scalable Vision Transformers with Hierarchical Pooling | Code | 1 |
| Training data-efficient image transformers & distillation through attention | Code | 1 |
| ImagePiece: Content-aware Re-tokenization for Efficient Image Recognition | | 0 |
| M^2-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization | | 0 |
| Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer | Code | 0 |
| An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT | | 0 |
| Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers | Code | 0 |
| IA-RED^2: Interpretability-Aware Redundancy Reduction for Vision Transformers | | 0 |
| Patch Slimming for Efficient Vision Transformers | | 0 |
