
Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
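To make two of the techniques named above concrete, here is a minimal sketch of magnitude-based parameter pruning followed by post-training dynamic weight quantization. It assumes PyTorch and is not the method of any paper listed on this page; the toy model, the 50% sparsity level, and the 8-bit dtype are all illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative toy model; any nn.Module containing Linear layers
# can be treated the same way.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Parameter pruning: zero out the 50% of weights with the smallest
# L1 magnitude in each Linear layer (50% is an arbitrary choice here).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the mask into the weights

# Weight quantization: convert Linear weights to 8-bit integers after
# training; activations are quantized dynamically at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 784)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Low-rank factorization, the third technique mentioned, would instead replace each weight matrix with a product of two thinner matrices, cutting parameter count at the cost of an approximation error.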

Papers

Showing 191–200 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Code | 1 |
| Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward | Code | 1 |
| Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination | Code | 1 |
| Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1 |
| Improving Neural Network Efficiency via Post-Training Quantization With Adaptive Floating-Point | Code | 1 |
| Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | Code | 1 |
| Initialization and Regularization of Factorized Neural Layers | Code | 1 |
| EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Code | 1 |
| Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models | Code | 1 |
| Memory-Efficient Backpropagation through Large Linear Layers | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified |