
Model Compression

Model Compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
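
The three technique families named above are easy to see on a single weight matrix. Below is a minimal NumPy sketch, not code from any paper listed on this page; the layer shape, the 50% sparsity, the 8-bit width, and the rank-32 choice are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128)).astype(np.float32)  # a dense layer's weights

# --- Parameter pruning: zero out the 50% of weights smallest in magnitude ---
sparsity = 0.5
threshold = np.quantile(np.abs(W), sparsity)
mask = np.abs(W) >= threshold
W_pruned = W * mask
print(f"pruned fraction: {1 - mask.mean():.2%}")

# --- Weight quantization: symmetric uniform 8-bit quantization ---
num_bits = 8
scale = np.abs(W).max() / (2 ** (num_bits - 1) - 1)
W_q = np.clip(np.round(W / scale), -128, 127).astype(np.int8)  # stored form
W_deq = W_q.astype(np.float32) * scale                         # used at inference
print(f"max quantization error: {np.abs(W - W_deq).max():.4f}")

# --- Low-rank factorization: truncated SVD keeps the top-k singular vectors ---
k = 32
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A, B = U[:, :k] * S[:k], Vt[:k]  # W ~ A @ B: 256*32 + 32*128 params vs 256*128
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"rank-{k} relative reconstruction error: {rel_err:.3f}")
```

In practice these transforms are applied per layer and are typically followed by fine-tuning to recover the lost accuracy.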

Papers

Showing 991–1000 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Aligned Weight Regularizers for Pruning Pretrained Neural Networks | | 0 |
| Post-Training Quantization for Video Matting | | 0 |
| Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs | | 0 |
| Post-Training Weighted Quantization of Neural Networks for Language Models | | 0 |
| PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation | | 0 |
| Practical quantum federated learning and its experimental demonstration | | 0 |
| Precise Box Score: Extract More Information from Datasets to Improve the Performance of Face Detection | | 0 |
| What do larger image classifiers memorise? | | 0 |
| Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data | | 0 |
| Preview-based Category Contrastive Learning for Knowledge Distillation | | 0 |
Page 100 of 136

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |
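
For context on the "2bit-1dim" entries: DKM compresses a model by clustering its scalar weights into 2^2 = 4 shared centroids, so each weight is stored as a 2-bit index into a small codebook. The sketch below is a simplified hard-assignment k-means variant (DKM itself uses a differentiable soft assignment so the clustering can be trained end to end); the function name, tensor shape, and iteration count are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def cluster_weights(W, bits=2, iters=20):
    """Quantize a weight tensor to 2**bits shared centroids via hard k-means."""
    w = W.reshape(-1)
    # Initialize centroids evenly across the weight range.
    centroids = np.linspace(w.min(), w.max(), 2 ** bits)
    for _ in range(iters):
        # Assign each weight to its nearest centroid (this is the stored index).
        idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for c in range(len(centroids)):
            if np.any(idx == c):
                centroids[c] = w[idx == c].mean()
    return centroids[idx].reshape(W.shape), idx, centroids

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64)).astype(np.float32)
W_clustered, idx, codebook = cluster_weights(W, bits=2)
print("codebook:", np.round(codebook, 3))
print(f"mean clustering error: {np.abs(W - W_clustered).mean():.4f}")
```

With 2 bits per weight plus a 4-entry codebook, storage drops to roughly 1/16 of 32-bit floats, which is why the 1-bit variant above trades so much accuracy for size.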