SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
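
The three families of methods named above can be illustrated on a single weight matrix. The following is a minimal NumPy sketch, not the approach of the KD-MRI source or of any paper listed on this page. The function names (`magnitude_prune`, `low_rank_factorize`, `quantize_uniform`, `dequantize`) are hypothetical, and the specific choices (global magnitude threshold, truncated SVD, uniform min-max quantization with at most 8 bits) are assumptions made for illustration.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Parameter pruning: zero the smallest-magnitude entries.

    At least `int(sparsity * w.size)` entries end up as zero.
    """
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)


def low_rank_factorize(w, rank):
    """Low-rank factorization: approximate w (m x n) by a (m x r) @ b (r x n)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b


def quantize_uniform(w, bits=8):
    """Weight quantization: map floats to integer codes on a uniform grid.

    Assumes bits <= 8 so that the codes fit in uint8.
    """
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, scale, lo


def dequantize(q, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale + lo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32)).astype(np.float32)

    pruned = magnitude_prune(w, sparsity=0.5)
    a, b = low_rank_factorize(w, rank=8)
    q, scale, lo = quantize_uniform(w, bits=8)

    print("zeros after pruning:", int((pruned == 0).sum()), "of", w.size)
    print("low-rank parameters:", a.size + b.size, "vs", w.size)
    print("max quantization error:", float(np.abs(w - dequantize(q, scale, lo)).max()))
```

Each function trades accuracy for size in a different way: pruning removes parameters, factorization replaces one large matrix with two smaller ones, and quantization keeps every parameter but stores it in fewer bits. In practice these steps are usually followed by fine-tuning to recover accuracy, which this sketch omits.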

Papers

Showing 651–675 of 1356 papers

Title | Status | Hype
Activation Map Adaptation for Effective Knowledge Distillation | | 0
Single-path Bit Sharing for Automatic Loss-aware Model Compression | | 0
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments | | 0
Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation | | 0
DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices | | 0
AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent | | 0
A Memory-Efficient Learning Framework for Symbol-Level Precoding with Quantized NN Weights | | 0
Neural Epitome Search for Architecture-Agnostic Network Compression | | 0
Deep Model Compression Via Two-Stage Deep Reinforcement Learning | | 0
Deep Model Compression: Distilling Knowledge from Noisy Teachers | | 0
Deep Model Compression based on the Training History | | 0
A Web-Based Solution for Federated Learning with LLM-Based Automation | | 0
AMD: Automatic Multi-step Distillation of Large-scale Vision Models | | 0
Deep learning model compression using network sensitivity and gradients | | 0
DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization | | 0
AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates | | 0
Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes | | 0
Automatic Mixed-Precision Quantization Search of BERT | | 0
Joint Regularization on Activations and Weights for Efficient Neural Network Pruning | | 0
Joint Neural Architecture Search and Quantization | | 0
Deep Collective Knowledge Distillation | | 0
Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration | | 0
AMD: Adaptive Masked Distillation for Object Detection | | 0
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks | | 0
It's always personal: Using Early Exits for Efficient On-Device CNN Personalisation | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified