SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
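As a rough illustration of the three method families named above, the following NumPy sketch applies each one to a single random weight matrix. It is a minimal sketch under stated assumptions, not the method of any paper listed here: the function names (`magnitude_prune`, `low_rank_factorize`, `quantize_uniform`, `dequantize`), the 50% sparsity, the rank of 8, and the 4-bit width are all hypothetical choices made for the example.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Parameter pruning: zero the smallest-magnitude fraction of the weights."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)


def low_rank_factorize(w, rank):
    """Low-rank factorization: approximate w by a @ b using a truncated SVD."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (rows, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, cols)
    return a, b


def quantize_uniform(w, bits):
    """Weight quantization: map weights to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((w - lo) / scale).astype(np.int64)
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from integer codes."""
    return codes * scale + lo


# Hypothetical 64x32 weight matrix standing in for one layer of a network.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32))

pruned = magnitude_prune(w, 0.5)           # half of the entries set to zero
a, b = low_rank_factorize(w, 8)            # 64*8 + 8*32 values instead of 64*32
codes, scale, lo = quantize_uniform(w, 4)  # 4-bit codes plus two floats
restored = dequantize(codes, scale, lo)
```

Each function trades accuracy for size in a different way: pruning removes individual weights, factorization replaces one matrix with two thinner ones, and quantization keeps every weight but stores it at lower precision. In practice these steps are usually followed by fine-tuning, which this sketch omits.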

Papers

Showing 101-150 of 1356 papers

Title | Status | Hype
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
An Empirical Study of CLIP for Text-based Person Search | Code | 1
Discovering Dynamic Patterns from Spatiotemporal Data with Time-Varying Low-Rank Autoregression | Code | 1
Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks | Code | 1
Discrimination-aware Channel Pruning for Deep Neural Networks | Code | 1
Discrimination-aware Network Pruning for Deep Model Compression | Code | 1
Generative Model-based Feature Knowledge Distillation for Action Recognition | Code | 1
An Information Theory-inspired Strategy for Automatic Network Pruning | Code | 1
Distilling Linguistic Context for Language Model Compression | Code | 1
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning | Code | 1
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers | Code | 1
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization | Code | 1
Dual Relation Knowledge Distillation for Object Detection | Code | 1
DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization | Code | 1
Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Code | 1
Dynamic Slimmable Network | Code | 1
HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation | Code | 1
MobileNMT: Enabling Translation in 15MB and 30ms | Code | 1
Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks | Code | 1
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better | Code | 1
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | Code | 1
A Real-time Low-cost Artificial Intelligence System for Autonomous Spraying in Palm Plantations | Code | 1
FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos | Code | 1
Efficient On-Device Session-Based Recommendation | Code | 1
Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Code | 1
Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection | Code | 1
Fast Vocabulary Transfer for Language Model Compression | Code | 1
CHEX: CHannel EXploration for CNN Model Compression | Code | 1
FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation | Code | 1
Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning | Code | 1
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | Code | 1
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Code | 1
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | Code | 1
Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis | Code | 1
Basic Binary Convolution Unit for Binarized Image Restoration Network | Code | 1
Bidirectional Distillation for Top-K Recommender System | Code | 1
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
FedUKD: Federated UNet Model with Knowledge Distillation for Land Use Classification from Satellite and Street Views | Code | 1
A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion | Code | 1
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation | Code | 1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1
Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models | Code | 1
Communication-Computation Trade-Off in Resource-Constrained Edge Inference | Code | 1
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Code | 1
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Code | 1
Compacting, Picking and Growing for Unforgetting Continual Learning | Code | 1
A Unified Pruning Framework for Vision Transformers | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | - | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | - | Unverified