SOTAVerified

Model Compression

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power and resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
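The three techniques named above can be sketched in a few lines of NumPy. This is an illustrative example only: the matrix size, sparsity level, rank, and bit width are arbitrary choices for demonstration, not taken from any paper listed below.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))  # a hypothetical dense layer's weight matrix

# Parameter pruning: zero out the 90% smallest-magnitude weights,
# leaving a sparse matrix that can be stored compactly.
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# Low-rank factorization: keep only the top-k singular values, so the
# layer is stored as two thin factors A (m x k) and B (k x n) instead
# of one m x n matrix (here 2*256*32 vs. 256*256 parameters, ~4x smaller).
k = 32
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * S[:k]   # shape (256, 32)
B = Vt[:k, :]          # shape (32, 256)
W_lowrank = A @ B      # approximate reconstruction of W

# Weight quantization: map float32 weights to 8-bit integers with a
# single per-tensor scale, quartering the storage cost.
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).astype(np.int8)
W_dequant = W_q.astype(np.float32) * scale
```

In practice these methods are applied to trained networks (often followed by fine-tuning to recover accuracy) and are frequently combined, e.g. pruning followed by quantization.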

Papers

Showing 176-200 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learned Step Size Quantization | Code | 1 |
| Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1 |
| Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Code | 1 |
| Comprehensive Knowledge Distillation with Causal Intervention | Code | 1 |
| Composable Interventions for Language Models | Code | 1 |
| LiMuSE: Lightweight Multi-modal Speaker Extraction | Code | 1 |
| LiteYOLO-ID: A Lightweight Object Detection Network for Insulator Defect Detection | Code | 1 |
| "Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | Code | 1 |
| BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | Code | 1 |
| Streamlining Redundant Layers to Compress Large Language Models | Code | 1 |
| Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network | Code | 1 |
| Merging Feed-Forward Sublayers for Compressed Transformers | Code | 1 |
| DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization | Code | 1 |
| Compacting, Picking and Growing for Unforgetting Continual Learning | Code | 1 |
| Bidirectional Distillation for Top-K Recommender System | Code | 1 |
| Contrastive Representation Distillation | Code | 1 |
| DiSparse: Disentangled Sparsification for Multitask Model Compression | Code | 1 |
| Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning | Code | 1 |
| Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1 |
| An Efficient Multilingual Language Model Compression through Vocabulary Trimming | Code | 1 |
| Passport-aware Normalization for Deep Model Protection | Code | 1 |
| Performance-aware Approximation of Global Channel Pruning for Multitask CNNs | Code | 1 |
| Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition | Code | 1 |
| Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning | Code | 1 |
| Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning | Code | 1 |
Page 8 of 55

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |