SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
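The three techniques named above differ in what they shrink: pruning removes individual weights, low-rank factorization replaces a weight matrix with a product of thinner matrices, and quantization stores weights at lower precision. Below is a minimal NumPy sketch of all three; the shapes, sparsity target, and bit width are arbitrary choices for illustration and are not taken from any paper listed on this page.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)  # stand-in dense weight matrix

# 1) Magnitude pruning: zero the 90% of weights with the smallest |w|.
threshold = np.quantile(np.abs(W), 0.90)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)
sparsity = 1.0 - np.count_nonzero(W_pruned) / W_pruned.size

# 2) Low-rank factorization: approximate W with a rank-r product U @ V.
r = 32
U_full, s, Vt = np.linalg.svd(W, full_matrices=False)
U = U_full[:, :r] * s[:r]   # absorb singular values into U
V = Vt[:r, :]
params_saved = W.size - (U.size + V.size)

# 3) Uniform 8-bit quantization: round to int8 codes with a shared scale.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dq = W_q.astype(np.float32) * scale   # dequantize for use at inference

print(f"sparsity={sparsity:.2f}  "
      f"low-rank params saved={params_saved}  "
      f"max quantization error={np.abs(W - W_dq).max():.4f}")
```

In practice these operations are applied to a trained network and are typically followed by fine-tuning to recover the accuracy lost to compression.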

Papers

Showing 251–300 of 1356 papers

Title | Status | Hype
Towards Efficient Model Compression via Learned Global Ranking | Code | 0
Learning Efficient Detector with Semi-supervised Adaptive Distillation | Code | 0
Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind | Code | 0
Compressed Object Detection | Code | 0
Learning Intrinsic Sparse Structures within Long Short-Term Memory | Code | 0
Canonical convolutional neural networks | Code | 0
RanDeS: Randomized Delta Superposition for Multi-Model Compression | Code | 0
Learning Deep and Compact Models for Gesture Recognition | Code | 0
Light Multi-segment Activation for Model Compression | Code | 0
Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup | Code | 0
Knowledge Translation: A New Pathway for Model Compression | Code | 0
Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0
Knowledge Distillation with Reptile Meta-Learning for Pretrained Language Model Compression | Code | 0
Class-dependent Compression of Deep Neural Networks | Code | 0
Boosting Large Language Models with Mask Fine-Tuning | Code | 0
Knowledge Distillation for Singing Voice Detection | Code | 0
Knowledge Grafting of Large Language Models | Code | 0
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression | Code | 0
An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs | Code | 0
StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs | Code | 0
Knowledge Distillation as Semiparametric Inference | Code | 0
Iterative Filter Pruning for Concatenation-based CNN Architectures | Code | 0
JavaScript Convolutional Neural Networks for Keyword Spotting in the Browser: An Experimental Analysis | Code | 0
Binary Classification as a Phase Separation Process | Code | 0
BinaryBERT: Pushing the Limit of BERT Quantization | Code | 0
Accelerating and Compressing Deep Neural Networks for Massive MIMO CSI Feedback | Code | 0
Knowledge Distillation for End-to-End Person Search | Code | 0
Learning Compression from Limited Unlabeled Data | Code | 0
Occam Gradient Descent | Code | 0
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs | Code | 0
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression | Code | 0
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition | Code | 0
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning | Code | 0
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory | Code | 0
Image Classification with CondenseNeXt for ARM-Based Computing Platforms | Code | 0
InDistill: Information flow-preserving knowledge distillation for model compression | Code | 0
How does topology of neural architectures impact gradient propagation and model performance? | Code | 0
Bayesian Tensorized Neural Networks with Automatic Rank Selection | Code | 0
High-fidelity 3D Model Compression based on Key Spheres | Code | 0
Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning | Code | 0
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression | Code | 0
Group channel pruning and spatial attention distilling for object detection | Code | 0
A Miniaturized Semantic Segmentation Method for Remote Sensing Image | Code | 0
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples | Code | 0
HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation | Code | 0
Information-Theoretic Understanding of Population Risk Improvement with Model Compression | Code | 0
Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures | Code | 0
A Brief Review of Hypernetworks in Deep Learning | Code | 0
AutoMC: Automated Model Compression based on Domain Knowledge and Progressive search strategy | Code | 0
GASL: Guided Attention for Sparsity Learning in Deep Neural Networks | Code | 0
Page 6 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified
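For context on the DKM rows: DKM (differentiable k-means) compresses a network by clustering its weights so that each weight is replaced by one of a small set of shared centroids; a "2bit-1dim" configuration clusters individual scalar weights into 2^2 = 4 centroids, so each weight is stored as a 2-bit code plus a tiny codebook. The sketch below shows only the underlying clustering idea using plain hard k-means; the helper kmeans_1d and all sizes are illustrative, and DKM itself uses a soft, differentiable assignment during training rather than this hard variant.

```python
import numpy as np

def kmeans_1d(w, n_bits=2, n_iters=20):
    """Cluster a flat weight vector into 2**n_bits shared centroids."""
    k = 2 ** n_bits
    # Initialize centroids at evenly spaced quantiles of the weights.
    centroids = np.quantile(w, np.linspace(0.0, 1.0, k))
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid (hard assignment).
        assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of the weights assigned to it.
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = w[assign == j].mean()
    return centroids, assign

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)   # stand-in weight vector
centroids, assign = kmeans_1d(w, n_bits=2)
w_compressed = centroids[assign]               # reconstruction from 2-bit codes
print(f"mean abs error: {np.abs(w - w_compressed).mean():.4f}")
```

Storing 2-bit codes plus a four-entry float codebook in place of float32 weights gives roughly a 16x size reduction, which is why the claimed accuracy drops as the bit width shrinks from 2 bits to 1 bit in the table above.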