SOTAVerified

Model Compression

Model compression has been an actively pursued research area in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
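The three compression techniques named above can be illustrated on a toy weight matrix. The sketch below is not taken from any of the listed papers; the function names, sparsity level, rank, and bit-width are arbitrary choices for demonstration:

```python
import numpy as np

def magnitude_prune(w, sparsity=0.5):
    """Parameter pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(np.ceil(sparsity * w.size))
    if k == 0:
        return w.copy()
    # Threshold at the k-th smallest absolute value.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned

def low_rank_approx(w, rank=2):
    """Low-rank factorization: keep only the top singular components."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank]

def uniform_quantize(w, bits=4):
    """Weight quantization: symmetric uniform quantization, then dequantize."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    if scale == 0:
        return w.copy()
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.5)   # half the entries become zero
w_lr = low_rank_approx(w, rank=2)             # rank-2 approximation of w
w_quant = uniform_quantize(w_pruned, bits=4)  # 4-bit grid of values
print("zeros after pruning:", int((w_pruned == 0).sum()))
```

In practice these steps are applied per layer with task-specific tuning (and often fine-tuning afterwards); this sketch only shows the core tensor operations each method performs.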

Papers

Showing 181–190 of 1356 papers

Title | Status | Hype
How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding | Code | 1
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers | Code | 1
Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Code | 1
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | Code | 1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1
Dynamic Slimmable Network | Code | 1
Passport-aware Normalization for Deep Model Protection | Code | 1
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets | Code | 1
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better | Code | 1
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming | Code | 1
Page 19 of 136

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | – | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | – | Unverified