
Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
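
The three techniques named above can be sketched in a few lines each. The following is a minimal illustration, not taken from any paper listed on this page: magnitude-based parameter pruning, symmetric 8-bit weight quantization, and rank-r factorization, all applied to a single random weight matrix. The matrix size, 90% sparsity, int8 bit-width, and rank of 32 are illustrative assumptions.

```python
# Minimal sketch (illustrative, not from the source) of three compression
# techniques applied to one dense layer's weight matrix.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)  # stand-in for a layer's weights

# --- Parameter pruning: zero out the smallest-magnitude 90% of weights ---
threshold = np.quantile(np.abs(W), 0.90)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)
print("sparsity:", np.mean(W_pruned == 0))            # ~0.90

# --- Weight quantization: symmetric uniform quantization to int8 ---
scale = np.abs(W).max() / 127.0                       # map [-max, max] to [-127, 127]
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_deq = W_q.astype(np.float32) * scale                # dequantize for inference
print("quantization MSE:", np.mean((W - W_deq) ** 2))

# --- Low-rank factorization: approximate W with rank-r factors U @ V ---
U_full, S, Vt = np.linalg.svd(W, full_matrices=False)
r = 32                                                # retained rank (illustrative)
U = U_full[:, :r] * S[:r]                             # fold singular values into U
V = Vt[:r, :]                                         # W ~ U @ V: 2*256*r values stored
print("low-rank MSE:", np.mean((W - U @ V) ** 2))
```

In each case the compressed representation (sparse mask, int8 codes plus a scale, or two thin factors) is what would be stored and shipped to the device; accuracy is then recovered or preserved by fine-tuning in practice.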

Papers

Showing 51–60 of 1356 papers (page 6 of 136)

Title | Status | Hype
Knowledge Distillation with Refined Logits | Code | 1
Composable Interventions for Language Models | Code | 1
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging | Code | 1
LiteYOLO-ID: A Lightweight Object Detection Network for Insulator Defect Detection | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Transferable and Principled Efficiency for Open-Vocabulary Segmentation | Code | 1
Streamlining Redundant Layers to Compress Large Language Models | Code | 1
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | – | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | – | Unverified