SOTAVerified

Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
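As a minimal illustration of one of the techniques named above, magnitude-based parameter pruning zeroes out the weights with the smallest absolute values. The sketch below is a generic, framework-agnostic example (not the method of the cited paper); the function name and sparsity setting are illustrative.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # Threshold = k-th smallest absolute weight value
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))          # stand-in for a layer's weight matrix
w_pruned = magnitude_prune(w, sparsity=0.9)
print(np.mean(w_pruned == 0))          # roughly 0.9 of entries are zero
```

In practice the pruned network is usually fine-tuned afterwards to recover accuracy, and the resulting sparse weights can be stored in a compressed format.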

Papers

Showing 201–250 of 1356 papers

Title | Status | Hype
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | Code | 1
Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Code | 1
General Instance Distillation for Object Detection | Code | 1
Gaussian RAM: Lightweight Image Classification via Stochastic Retina-Inspired Glimpse and Reinforcement Learning | Code | 1
DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization | Code | 1
Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks | Code | 1
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection | Code | 1
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization | Code | 1
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers | Code | 1
Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms | Code | 0
Learning Intrinsic Sparse Structures within Long Short-Term Memory | Code | 0
Are Compressed Language Models Less Subgroup Robust? | Code | 0
Learning Deep and Compact Models for Gesture Recognition | Code | 0
Learning Efficient Detector with Semi-supervised Adaptive Distillation | Code | 0
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression | Code | 0
Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0
Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup | Code | 0
Learning Compression from Limited Unlabeled Data | Code | 0
Towards Efficient Model Compression via Learned Global Ranking | Code | 0
A Programmable Approach to Neural Network Compression | Code | 0
Privacy and Accuracy Implications of Model Complexity and Integration in Heterogeneous Federated Learning | Code | 0
Knowledge Distillation for Singing Voice Detection | Code | 0
Knowledge Distillation as Semiparametric Inference | Code | 0
Knowledge Distillation for End-to-End Person Search | Code | 0
Knowledge Distillation with Reptile Meta-Learning for Pretrained Language Model Compression | Code | 0
Application Specific Compression of Deep Learning Models | Code | 0
JavaScript Convolutional Neural Networks for Keyword Spotting in the Browser: An Experimental Analysis | Code | 0
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Code | 0
Iterative Filter Pruning for Concatenation-based CNN Architectures | Code | 0
InDistill: Information flow-preserving knowledge distillation for model compression | Code | 0
Information-Theoretic Understanding of Population Risk Improvement with Model Compression | Code | 0
PruMUX: Augmenting Data Multiplexing with Model Compression | Code | 0
Knowledge Grafting of Large Language Models | Code | 0
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition | Code | 0
Chemical transformer compression for accelerating both training and inference of molecular modeling | Code | 0
Annealing Knowledge Distillation | Code | 0
Image Classification with CondenseNeXt for ARM-Based Computing Platforms | Code | 0
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0
HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation | Code | 0
Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy | Code | 0
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design | Code | 0
Knowledge Translation: A New Pathway for Model Compression | Code | 0
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression | Code | 0
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory | Code | 0
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs | Code | 0
Causal Explanation of Convolutional Neural Networks | Code | 0
CASP: Compression of Large Multimodal Models Based on Attention Sparsity | Code | 0
An exploration of the effect of quantisation on energy consumption and inference time of StarCoder2 | Code | 0
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization | Code | 0
Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning | Code | 0
Page 5 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified