SOTAVerified

Model Compression

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
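
To make the three technique families named above concrete, here is a minimal, illustrative PyTorch sketch that applies each one independently to a toy weight matrix. The shapes, sparsity level, rank, and bit-width are arbitrary demonstration choices, not taken from any paper listed below.

```python
import torch

torch.manual_seed(0)
W = torch.randn(64, 128)  # toy weight matrix standing in for one layer

# --- 1. Parameter pruning (unstructured, magnitude-based) ---
# Zero out the 50% of weights with the smallest absolute value.
sparsity = 0.5
threshold = torch.quantile(W.abs().flatten(), sparsity)
W_pruned = W * (W.abs() > threshold)

# --- 2. Low-rank factorization (truncated SVD) ---
# Replace the 64x128 matrix with rank-r factors, cutting
# parameters from 64*128 down to r*(64+128).
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
r = 16  # target rank (illustrative)
W_lowrank = (U[:, :r] * S[:r]) @ Vh[:r, :]

# --- 3. Weight quantization (symmetric 8-bit, per-tensor) ---
# Round weights to 256 integer levels; store int8 values plus one
# float scale, dequantizing at inference time.
scale = W.abs().max() / 127.0
W_q = torch.clamp(torch.round(W / scale), -128, 127) * scale

print(f"pruned sparsity: {(W_pruned == 0).float().mean().item():.1%}")
print(f"low-rank error:  {((W - W_lowrank).norm() / W.norm()).item():.3f}")
print(f"quant error:     {((W - W_q).norm() / W.norm()).item():.3f}")
```

In practice these methods are applied per layer across a full network and are often combined (e.g., pruning followed by quantization), usually with fine-tuning to recover accuracy.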

Papers

Showing 151–200 of 1356 papers

Title | Status | Hype
General Instance Distillation for Object Detection | Code | 1
An Information-Theoretic Justification for Model Pruning | Code | 1
FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation | Code | 1
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | Code | 1
Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning | Code | 1
Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching | Code | 1
Improving Neural Network Efficiency via Post-Training Quantization With Adaptive Floating-Point | Code | 1
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors | Code | 1
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
Neural Pruning via Growing Regularization | Code | 1
Progressive Network Grafting for Few-Shot Knowledge Distillation | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
Going Beyond Classification Accuracy Metrics in Model Compression | Code | 1
Multi-level Knowledge Distillation via Knowledge Alignment and Correlation | Code | 1
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization | Code | 1
Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems | Code | 1
HAWQV3: Dyadic Neural Network Quantization | Code | 1
Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning | Code | 1
VEGA: Towards an End-to-End Configurable AutoML Pipeline | Code | 1
Passport-aware Normalization for Deep Model Protection | Code | 1
CompRess: Self-Supervised Learning by Compressing Representations | Code | 1
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination | Code | 1
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
Implicit Regularization via Neural Feature Alignment | Code | 1
Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation | Code | 1
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming | Code | 1
Knowledge Distillation Meets Self-Supervision | Code | 1
Communication-Computation Trade-Off in Resource-Constrained Edge Inference | Code | 1
Online Knowledge Distillation via Collaborative Learning | Code | 1
Position-based Scaled Gradient for Model Quantization and Pruning | Code | 1
TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids | Code | 1
MicroNet for Efficient Language Modeling | Code | 1
Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1
WoodFisher: Efficient Second-Order Approximation for Neural Network Compression | Code | 1
Training with Quantization Noise for Extreme Model Compression | Code | 1
KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow | Code | 1
Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization | Code | 1
Variational Bayesian Quantization | Code | 1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1
Discrimination-aware Network Pruning for Deep Model Compression | Code | 1
ZeroQ: A Novel Zero Shot Quantization Framework | Code | 1
Learning from a Teacher using Unlabeled Data | Code | 1
Contrastive Representation Distillation | Code | 1
Compacting, Picking and Growing for Unforgetting Continual Learning | Code | 1
Structured Pruning of Large Language Models | Code | 1
Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems | Code | 1
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | – | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | – | Unverified
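
The DKM entries above refer to cluster-based weight compression, where weights are replaced by indices into a small codebook ("2bit-1dim" means a 4-entry codebook over scalar weights). As a rough illustration only, the sketch below uses plain hard k-means (Lloyd's algorithm) rather than DKM's differentiable soft-assignment formulation; all names and sizes are hypothetical.

```python
import torch

torch.manual_seed(0)
W = torch.randn(64, 128)  # stand-in weight matrix

bits = 2                  # "2bit-1dim": 2**2 = 4 centroids over scalar weights
k = 2 ** bits
w = W.flatten()

# Initialize centroids from weight quantiles, then run a few Lloyd steps.
centroids = torch.quantile(w, torch.linspace(0.1, 0.9, k))
for _ in range(20):
    # Assign each weight to its nearest centroid.
    assign = (w[:, None] - centroids[None, :]).abs().argmin(dim=1)
    # Move each centroid to the mean of its assigned weights.
    for j in range(k):
        sel = w[assign == j]
        if sel.numel() > 0:
            centroids[j] = sel.mean()

# Each weight is now stored as a 2-bit code into the 4-entry codebook.
W_clustered = centroids[assign].reshape(W.shape)
print(f"reconstruction error: {((W - W_clustered).norm() / W.norm()).item():.3f}")
```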