
Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
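To make the three techniques named above concrete, here is a minimal NumPy sketch (the matrix shape, sparsity level, rank, and bit width are illustrative choices, not taken from any paper listed below) that applies magnitude pruning, low-rank factorization via truncated SVD, and uniform 8-bit quantization to a toy weight matrix and reports the reconstruction error of each:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)  # toy weight matrix

# 1) Magnitude pruning: zero out the 90% of weights with smallest |w|.
threshold = np.quantile(np.abs(W), 0.90)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# 2) Low-rank factorization: approximate W by a rank-r product U @ V.
r = 32
U_, S, Vt = np.linalg.svd(W, full_matrices=False)
U = U_[:, :r] * S[:r]   # shape (256, r)
V = Vt[:r, :]           # shape (r, 256)
W_lowrank = U @ V       # stores 2*256*r values instead of 256*256

# 3) Uniform 8-bit quantization: map weights to int8 with a per-tensor scale.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dequant = W_q.astype(np.float32) * scale

for name, approx in [("pruned", W_pruned), ("low-rank", W_lowrank), ("quantized", W_dequant)]:
    err = np.linalg.norm(W - approx) / np.linalg.norm(W)
    print(f"{name:10s} relative error: {err:.3f}")
```

In practice these steps are applied per layer and usually followed by fine-tuning to recover accuracy; the sketch shows only the approximation step.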

Papers

Showing 26–50 of 1356 papers

Title | Status | Hype
Fast convolutional neural networks on FPGAs with hls4ml | Code | 2
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | Code | 2
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models | Code | 2
AMC: AutoML for Model Compression and Acceleration on Mobile Devices | Code | 2
Data-Free Knowledge Distillation for Deep Neural Networks | Code | 2
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation | Code | 1
Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting | Code | 1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1
Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Code | 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models | Code | 1
Activation-Informed Merging of Large Language Models | Code | 1
A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion | Code | 1
Merging Feed-Forward Sublayers for Compressed Transformers | Code | 1
CoA: Towards Real Image Dehazing via Compression-and-Adaptation | Code | 1
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN | Code | 1
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Code | 1
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Code | 1
SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression | Code | 1
QT-DoG: Quantization-aware Training for Domain Generalization | Code | 1
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Code | 1
Search for Efficient Large Language Models | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Hyper-Compression: Model Compression via Hyperfunction | Code | 1
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic | Code | 1
Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers | Code | 1
Page 2 of 55

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified
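DKM in the rows above refers to differentiable k-means weight clustering: in the "2bit-1dim" setting, every scalar weight is replaced by one of 2^2 = 4 shared centroids, so each weight costs 2 bits plus a small codebook. As a rough illustration only, the sketch below implements plain hard k-means clustering of weights in NumPy; the actual DKM method makes the cluster assignments soft and differentiable so they can be trained end-to-end, which this sketch does not attempt:

```python
import numpy as np

def kmeans_weight_clustering(W, bits=2, iters=20, seed=0):
    """Cluster scalar weights into 2**bits shared centroids
    (hard k-means, a non-differentiable stand-in for DKM)."""
    rng = np.random.default_rng(seed)
    w = W.ravel()
    k = 2 ** bits
    centroids = rng.choice(w, size=k, replace=False)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = w[assign == j]
            if members.size:
                centroids[j] = members.mean()
    assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
    return assign.reshape(W.shape).astype(np.uint8), centroids

rng = np.random.default_rng(1)
W = rng.standard_normal((128, 128)).astype(np.float32)  # toy weight matrix
codes, centroids = kmeans_weight_clustering(W, bits=2)
W_hat = centroids[codes]  # reconstruct: each weight replaced by its centroid
print("centroids:", np.round(centroids, 3))
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

With bits=2 the codebook has four entries, matching the first row; with bits=1 it has only two, which is consistent with the much lower claimed accuracy in the second row.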