SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
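The three compression techniques named in the description can be sketched in a few lines each. The snippet below is a minimal, illustrative sketch using NumPy, not any paper's implementation: unstructured magnitude pruning (zero out small weights), truncated-SVD low-rank factorization, and uniform affine weight quantization. All function names and thresholds are illustrative choices.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning sketch: zero the smallest-magnitude
    fraction `sparsity` of the weights."""
    threshold = np.percentile(np.abs(weights), sparsity * 100)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def low_rank_factorize(weights, rank):
    """Low-rank factorization sketch: approximate a weight matrix by
    the product of two thin factors obtained from a truncated SVD."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (m, rank)
    b = vt[:rank]                # shape (rank, n)
    return a @ b                 # rank-`rank` approximation of weights

def quantize_dequantize(weights, num_bits=8):
    """Weight quantization sketch: uniform affine quantization to
    `num_bits`-bit integers, then dequantize back to floats."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / (2 ** num_bits - 1)
    q = np.round((weights - w_min) / scale).astype(np.int32)
    return q * scale + w_min     # low-precision approximation

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
pruned = magnitude_prune(w, sparsity=0.5)
print(f"zeros after 50% pruning: {np.sum(pruned == 0)} / {pruned.size}")
```

In practice these operate on trained layer weights and are typically followed by fine-tuning to recover accuracy; the storage saving comes from keeping the sparse mask, the thin factors, or the integer codes instead of the dense float matrix.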

Papers

Showing 751-760 of 1356 papers

Title | Status | Hype
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations | — | 0
Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1
A Unified Pruning Framework for Vision Transformers | Code | 1
FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization | — | 0
Exploring Low-Cost Transformer Model Compression for Large-Scale Commercial Reply Suggestions | — | 0
Accelerating Deep Learning with Dynamic Data Pruning | — | 0
NAM: Normalization-based Attention Module | Code | 1
Sharpness-aware Quantization for Deep Neural Networks | Code | 1
Semi-Online Knowledge Distillation | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified