
Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
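To make two of the methods named above concrete, here is a minimal NumPy sketch of unstructured magnitude pruning and symmetric uniform weight quantization. The function names (`magnitude_prune`, `uniform_quantize`) and the fake-quantization setup are illustrative assumptions for this page, not code from any of the listed papers.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def uniform_quantize(weights: np.ndarray, bits: int = 8) -> np.ndarray:
    """Simulate symmetric uniform quantization: snap weights to 2**bits
    integer levels, then dequantize back to floats ("fake quantization")."""
    max_abs = np.abs(weights).max()
    if max_abs == 0:
        return weights.copy()
    scale = max_abs / (2 ** (bits - 1) - 1)
    q = np.round(weights / scale).clip(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

if __name__ == "__main__":
    w = np.random.randn(4, 4).astype(np.float32)
    print(magnitude_prune(w, sparsity=0.5))   # half the entries set to zero
    print(uniform_quantize(w, bits=4))        # weights snapped to a 4-bit grid
```

In practice both operations are applied per layer and followed by fine-tuning to recover accuracy; the sketch only shows the core weight transformation.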

Papers

Showing 271-280 of 1356 papers

| Title | Status | Hype |
|---|---|---|
| A New Clustering-Based Technique for the Acceleration of Deep Convolutional Networks | | 0 |
| CURing Large Models: Compression via CUR Decomposition | | 0 |
| Can Students Outperform Teachers in Knowledge Distillation based Model Compression? | | 0 |
| Can Students Beyond The Teacher? Distilling Knowledge from Teacher's Bias | | 0 |
| A "Network Pruning Network" Approach to Deep Model Compression | | 0 |
| An Empirical Study of Low Precision Quantization for TinyML | | 0 |
| Can Model Compression Improve NLP Fairness | | 0 |
| Heterogeneous Federated Learning using Dynamic Model Pruning and Adaptive Gradient | | 0 |
| 2-bit Model Compression of Deep Convolutional Neural Network on ASIC Engine for Image Retrieval | | 0 |
| D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |
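The two entries above refer to DKM (differentiable k-means) weight clustering, where an n-bit "1dim" configuration maps each scalar weight onto a shared codebook of 2^n centroids. DKM itself learns soft cluster assignments jointly with the task loss; the sketch below shows only the classical hard-assignment baseline (Lloyd's k-means palettization) to make the "2-bit" idea concrete. `kmeans_palettize` is a hypothetical helper name, not an API from the DKM paper.

```python
import numpy as np

def kmeans_palettize(weights: np.ndarray, bits: int = 2, iters: int = 20) -> np.ndarray:
    """Cluster scalar weights into 2**bits centroids (Lloyd's algorithm) and
    replace each weight by its nearest centroid, i.e. a bits-bit codebook."""
    flat = weights.ravel()
    k = 2 ** bits
    # Initialize centroids evenly across the weight range.
    centroids = np.linspace(flat.min(), flat.max(), k)
    for _ in range(iters):
        # Hard-assign each weight to its nearest centroid, then re-center.
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = flat[assign == j]
            if members.size:
                centroids[j] = members.mean()
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign].reshape(weights.shape)

if __name__ == "__main__":
    w = np.random.randn(256).astype(np.float32)
    w2 = kmeans_palettize(w, bits=2)   # only 4 distinct values remain
    print(np.unique(w2).size)          # -> at most 4
```

With a 2-bit codebook, each weight needs only a 2-bit index plus a small shared table of 4 float centroids, which is where the large compression ratio comes from.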