
Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
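As a minimal illustration of two of the methods named above, the sketch below applies magnitude-based pruning and uniform fake quantization to a weight tensor in PyTorch. The function names, sparsity level, and bit-widths are illustrative choices for this sketch, not taken from any paper listed on this page.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold
    return weight * mask

def uniform_quantize(weight: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Fake-quantize weights onto a symmetric uniform grid (quantize, then dequantize)."""
    levels = 2 ** (bits - 1) - 1
    scale = weight.abs().max() / levels
    return torch.round(weight / scale).clamp(-levels, levels) * scale

w = torch.randn(256, 256)
w_pruned = magnitude_prune(w, sparsity=0.9)   # 90% of entries become zero
w_quant = uniform_quantize(w_pruned, bits=4)  # survivors snap to a 4-bit grid
```

In practice, pruning and quantization like this are usually followed by fine-tuning (or applied during training) to recover the lost accuracy.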

Papers

Showing 101–150 of 1356 papers

Title | Status | Hype
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning | Code | 1
Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data | Code | 1
3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching | Code | 1
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution | Code | 1
DiSparse: Disentangled Sparsification for Multitask Model Compression | Code | 1
Towards Efficient 3D Object Detection with Knowledge Distillation | Code | 1
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch | Code | 1
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection | Code | 1
Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval | Code | 1
Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network | Code | 1
Structured Pruning Learns Compact and Accurate Models | Code | 1
CHEX: CHannel EXploration for CNN Model Compression | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization | Code | 1
Memory-Efficient Backpropagation through Large Linear Layers | Code | 1
SPDY: Accurate Pruning with Speedup Guarantees | Code | 1
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks | Code | 1
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning | Code | 1
Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition | Code | 1
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1
A Unified Pruning Framework for Vision Transformers | Code | 1
NAM: Normalization-based Attention Module | Code | 1
Sharpness-aware Quantization for Deep Neural Networks | Code | 1
LiMuSE: Lightweight Multi-modal Speaker Extraction | Code | 1
Distilling Object Detectors with Feature Richness | Code | 1
Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks | Code | 1
Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices | Code | 1
Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis | Code | 1
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers | Code | 1
Distilling Linguistic Context for Language Model Compression | Code | 1
The NiuTrans System for WNGT 2020 Efficiency Task | Code | 1
How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding | Code | 1
An Information Theory-inspired Strategy for Automatic Network Pruning | Code | 1
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation | Code | 1
A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness | Code | 1
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better | Code | 1
ModelDiff: Testing-Based DNN Similarity Comparison for Model Reuse Detection | Code | 1
Bidirectional Distillation for Top-K Recommender System | Code | 1
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Code | 1
You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient | Code | 1
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization | Code | 1
Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning | Code | 1
Initialization and Regularization of Factorized Neural Layers | Code | 1
Skip-Convolutions for Efficient Video Processing | Code | 1
Differentiable Model Compression via Pseudo Quantization Noise | Code | 1
Deep Compression for PyTorch Model Deployment on Microcontrollers | Code | 1
Dynamic Slimmable Network | Code | 1
A Real-time Low-cost Artificial Intelligence System for Autonomous Spraying in Palm Plantations | Code | 1
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | Code | 1
Page 3 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified
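The "2bit-1dim" entries above refer to clustering each scalar ("1-dim") weight into 2^2 = 4 shared values. The sketch below illustrates that idea with plain hard-assignment k-means; DKM itself makes the assignment differentiable (soft) so clustering can be trained end-to-end, which this sketch does not reproduce. The function name and iteration count are illustrative.

```python
import torch

def cluster_weights(weight: torch.Tensor, bits: int = 2, iters: int = 20) -> torch.Tensor:
    """Replace weights with 2**bits shared values via hard k-means (Lloyd's algorithm)."""
    flat = weight.flatten()
    k = 2 ** bits
    # Spread initial centroids evenly over the observed weight range.
    centroids = torch.linspace(flat.min().item(), flat.max().item(), k)
    assign = torch.zeros(flat.numel(), dtype=torch.long)
    for _ in range(iters):
        # Assign every weight to its nearest centroid (scalar distance).
        assign = (flat.unsqueeze(1) - centroids).abs().argmin(dim=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = flat[assign == j]
            if members.numel() > 0:
                centroids[j] = members.mean()
    # Storing the bit-width indices plus the k centroids, rather than
    # full-precision weights, is what yields the compression.
    return centroids[assign].reshape(weight.shape)

w = torch.randn(128, 128)
w_2bit = cluster_weights(w, bits=2)  # only 4 distinct values remain
```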