SOTAVerified

Model Compression

Model compression has been an active area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
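
Two of the methods named in the description, parameter pruning and weight quantization, can be illustrated on a toy list of weights. The sketch below is only an illustration under simple assumptions (magnitude-based pruning and uniform min-max quantization, pure Python, no framework); it is not taken from the cited paper or from any paper listed on this page, and the function names and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_uniform(weights, bits):
    """Map each weight to the nearest of 2**bits evenly spaced levels.

    Returns the integer codes plus the (scale, w_min) needed to decode them.
    """
    levels = 2 ** bits - 1
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant weight list, where the range would be zero.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Reconstruct approximate float weights from integer codes."""
    return [w_min + c * scale for c in codes]


# Hypothetical example weights, chosen only for illustration.
weights = [0.8, -0.05, 0.3, -0.9, 0.02, 0.45, -0.6, 0.1]

pruned = magnitude_prune(weights, sparsity=0.5)          # half the weights become 0.0
codes, scale, w_min = quantize_uniform(weights, bits=2)  # each code fits in 2 bits
restored = dequantize(codes, scale, w_min)
```

With 2-bit codes each weight is stored as one of four levels, so the reconstruction error per weight is at most half of `scale`. Practical methods typically fine-tune the network after pruning or quantization to recover accuracy, which this sketch does not attempt.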

Papers

Showing 51–100 of 1356 papers

Title | Status | Hype
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks | Code | 1
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models | Code | 1
Compression-Aware Video Super-Resolution | Code | 1
CompRess: Self-Supervised Learning by Compressing Representations | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
Contrastive Representation Distillation | Code | 1
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination | Code | 1
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | Code | 1
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers | Code | 1
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks | Code | 1
Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning | Code | 1
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation | Code | 1
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Code | 1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1
Efficient On-Device Session-Based Recommendation | Code | 1
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets | Code | 1
Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection | Code | 1
Dynamic Channel Pruning: Feature Boosting and Suppression | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
Compacting, Picking and Growing for Unforgetting Continual Learning | Code | 1
Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Code | 1
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | Code | 1
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Code | 1
Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1
Distilling Linguistic Context for Language Model Compression | Code | 1
Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis | Code | 1
Basic Binary Convolution Unit for Binarized Image Restoration Network | Code | 1
Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems | Code | 1
Dual Relation Knowledge Distillation for Object Detection | Code | 1
Bidirectional Distillation for Top-K Recommender System | Code | 1
Distilling Object Detectors with Feature Richness | Code | 1
Dynamic Slimmable Network | Code | 1
Activation-Informed Merging of Large Language Models | Code | 1
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Discrimination-aware Channel Pruning for Deep Neural Networks | Code | 1
CHEX: CHannel EXploration for CNN Model Compression | Code | 1
Class Attention Transfer Based Knowledge Distillation | Code | 1
A Unified Pruning Framework for Vision Transformers | Code | 1
CoA: Towards Real Image Dehazing via Compression-and-Adaptation | Code | 1
Communication-Computation Trade-Off in Resource-Constrained Edge Inference | Code | 1
Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data | Code | 1
Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting | Code | 1
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
An Efficient Multilingual Language Model Compression through Vocabulary Trimming | Code | 1
Discrimination-aware Network Pruning for Deep Model Compression | Code | 1
FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation | Code | 1
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | Code | 1
Page 2 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 |  | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 |  | Unverified