
Model Compression

Model compression has been an actively pursued area of research in recent years, aiming to deploy state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
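Each of the techniques named in the description lends itself to a compact illustration. The snippet below is a minimal NumPy sketch of magnitude pruning, truncated-SVD low-rank factorization, and uniform 8-bit weight quantization applied to a random weight matrix; the shapes, sparsity level, rank, and bit width are illustrative assumptions, not settings taken from any paper listed on this page.

```python
# Generic sketches of the three compression techniques named above,
# using NumPy only. Textbook formulations, not any specific paper's method.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512)).astype(np.float32)  # a dense weight matrix

# 1) Parameter pruning: zero out the 90% of weights with smallest magnitude.
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# 2) Low-rank factorization: approximate W by a rank-r SVD truncation,
#    storing two thin factors instead of the full matrix.
r = 32
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]   # (256, r)
B = Vt[:r, :]          # (r, 512)
W_lowrank = A @ B      # reconstruction used at inference time

# 3) Weight quantization: uniform symmetric 8-bit quantization.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dequant = W_q.astype(np.float32) * scale

print("pruned nonzero fraction:", np.count_nonzero(W_pruned) / W.size)
print("low-rank rel. error:", np.linalg.norm(W - W_lowrank) / np.linalg.norm(W))
print("quantization rel. error:", np.linalg.norm(W - W_dequant) / np.linalg.norm(W))
```

In practice these operations are applied to trained layer weights and are usually followed by fine-tuning to recover the lost accuracy.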

Papers

Showing 331–340 of 1356 papers

Title | Status | Hype
Differential Privacy Meets Federated Learning under Communication Constraints | - | 0
Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey | - | 0
AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent | - | 0
A Memory-Efficient Learning Framework for Symbol-Level Precoding with Quantized NN Weights | - | 0
A Web-Based Solution for Federated Learning with LLM-Based Automation | - | 0
AMD: Automatic Multi-step Distillation of Large-scale Vision Models | - | 0
AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates | - | 0
DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization | - | 0
AMD: Adaptive Masked Distillation for Object Detection | - | 0
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks | - | 0
Page 34 of 136

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | - | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | - | Unverified
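For the DKM entries above, the "2bit-1dim" / "1bit-1dim" naming indicates that each scalar weight is replaced by one of 2^2 = 4 (respectively 2^1 = 2) shared codebook values. The sketch below is a plain hard k-means version of that weight-clustering idea on a synthetic weight vector; it is an illustrative assumption, not the differentiable soft-assignment procedure that DKM itself trains with.

```python
# Hard k-means weight clustering: each weight maps to one of 2**bits
# shared centroids. An illustration of "2bit-1dim" clustering in general,
# not the actual DKM training procedure.
import numpy as np

def cluster_weights(w, bits=2, iters=20):
    """Quantize a 1-D weight vector onto 2**bits shared centroids."""
    k = 2 ** bits
    # Initialize centroids evenly across the weight range.
    centroids = np.linspace(w.min(), w.max(), k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        # Recompute each centroid as the mean of its assigned weights.
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = w[assign == j].mean()
    return centroids[assign], assign

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)
w_q, codes = cluster_weights(w, bits=2)
# Storage drops from 32 bits/weight to 2 bits/weight plus a tiny codebook.
print("distinct values:", np.unique(w_q).size)        # 4
print("mse:", float(np.mean((w - w_q) ** 2)))
```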