Model Compression

Model compression is an actively pursued area of research aimed at deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
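As a rough sketch of the three techniques named above, the NumPy snippet below applies magnitude pruning, truncated-SVD low-rank factorization, and uniform quantization to a random weight matrix. The helper names (`magnitude_prune`, `low_rank_factorize`, `uniform_quantize`) and all parameter values are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

# Hypothetical helpers for illustration; not from the cited paper.

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Parameter pruning: zero out the smallest-magnitude entries."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def low_rank_factorize(weights: np.ndarray, rank: int):
    """Low-rank factorization: W ~ A @ B via truncated SVD,
    storing rank*(m+n) values instead of m*n."""
    U, s, Vt = np.linalg.svd(weights, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

def uniform_quantize(weights: np.ndarray, num_bits: int) -> np.ndarray:
    """Weight quantization: snap weights to 2**num_bits evenly spaced levels
    (simulated; real deployments store integer codes plus a scale/offset)."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / (2 ** num_bits - 1)
    return np.round((weights - w_min) / scale) * scale + w_min

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)

W_pruned = magnitude_prune(W, sparsity=0.9)   # ~90% of entries become zero
A, B = low_rank_factorize(W, rank=8)          # 64*64 -> 2*(64*8) stored values
W_2bit = uniform_quantize(W, num_bits=2)      # at most 4 distinct weight values

print(f"pruned sparsity:   {np.mean(W_pruned == 0):.2f}")
print(f"low-rank error:    {np.linalg.norm(W - A @ B) / np.linalg.norm(W):.3f}")
print(f"2-bit levels used: {np.unique(W_2bit).size}")
```

In practice each of these steps is typically followed by fine-tuning to recover lost accuracy; the snippet only simulates the storage savings.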

Papers

Showing 451–460 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Training dynamic models using early exits for automatic speech recognition on resource-constrained devices | Code | 0 |
| Pruning Large Language Models via Accuracy Predictor | | 0 |
| Two-Step Knowledge Distillation for Tiny Speech Enhancement | | 0 |
| CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | | 0 |
| Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization | | 0 |
| Norm Tweaking: High-performance Low-bit Quantization of Large Language Models | | 0 |
| Compressing Vision Transformers for Low-Resource Visual Learning | Code | 0 |
| ADC/DAC-Free Analog Acceleration of Deep Neural Networks with Frequency Transformation | | 0 |
| Uncovering the Hidden Cost of Model Compression | Code | 0 |
| Computation-efficient Deep Learning for Computer Vision: A Survey | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |