SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
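As a rough illustration of two of the methods named above, the sketch below applies magnitude-based parameter pruning and uniform weight quantization to a flat list of weights. It is a minimal toy example in plain Python, not the procedure of any paper listed on this page; the function names `prune_by_magnitude` and `quantize_uniform` and the sample weights are assumptions made for illustration.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uniform(weights, bits):
    """Snap each weight to the nearest of 2**bits evenly spaced levels
    spanning [min(weights), max(weights)]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so there is nothing to quantize.
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


# Hypothetical weights, chosen only to exercise the two functions.
weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.3]
sparse_weights = prune_by_magnitude(weights, sparsity=0.5)
two_bit_weights = quantize_uniform(weights, bits=2)
```

With `bits=2` every weight is stored as one of at most four values, so only a 2-bit index per weight plus the range needs to be kept. Practical schemes usually work per layer or per channel and fine-tune the network afterwards to recover accuracy.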

Papers

Showing 326–350 of 1356 papers

| Title | Status | Hype |
|---|---|---|
| Bayesian Federated Model Compression for Communication and Computation Efficiency | | 0 |
| Bayesian Deep Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing | | 0 |
| A Model Compression Method with Matrix Product Operators for Speech Enhancement | | 0 |
| A Mixed Integer Programming Approach for Verifying Properties of Binarized Neural Networks | | 0 |
| Balancing Specialization, Generalization, and Compression for Detection and Tracking | | 0 |
| Balancing Cost and Benefit with Tied-Multi Transformers | | 0 |
| Activation Map Adaptation for Effective Knowledge Distillation | | 0 |
| Single-path Bit Sharing for Automatic Loss-aware Model Compression | | 0 |
| DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | | 0 |
| Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression | | 0 |
| Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation | | 0 |
| A Memory-Efficient Learning Framework for SymbolLevel Precoding with Quantized NN Weights | | 0 |
| AMD: Automatic Multi-step Distillation of Large-scale Vision Models | | 0 |
| Deep Model Compression Via Two-Stage Deep Reinforcement Learning | | 0 |
| Deep Model Compression: Distilling Knowledge from Noisy Teachers | | 0 |
| Deep Model Compression based on the Training History | | 0 |
| A Web-Based Solution for Federated Learning with LLM-Based Automation | | 0 |
| Discrete Model Compression With Resource Constraint for Deep Neural Networks | | 0 |
| AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates | | 0 |
| Deep learning model compression using network sensitivity and gradients | | 0 |
| Neural Epitome Search for Architecture-Agnostic Network Compression | | 0 |
| AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent | | 0 |
| DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices | | 0 |
| DiPaCo: Distributed Path Composition | | 0 |
| AMD: Adaptive Masked Distillation for Object Detection | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |