
Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
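To make the three method families above concrete, here is a minimal NumPy sketch of each applied to a single dense weight matrix: magnitude-based parameter pruning, low-rank factorization via truncated SVD, and uniform symmetric weight quantization. The sparsity level, rank, and bit width are illustrative assumptions, not values taken from any of the papers listed below.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)  # a dense weight matrix

# 1) Parameter pruning: zero out the smallest-magnitude weights.
def magnitude_prune(w, sparsity=0.9):
    """Keep only the largest-|w| entries; zero the rest."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

# 2) Low-rank factorization: approximate W with two thin matrices.
def low_rank_factorize(w, rank=32):
    """Truncated SVD: W is approximated by A @ B with far fewer parameters."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (m, rank)
    b = vt[:rank, :]             # shape (rank, n)
    return a, b

# 3) Weight quantization: uniform symmetric quantization to a given bit width.
def quantize(w, bits=8):
    """Map float weights to signed integers plus a scale (simulated quantization)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

w_pruned = magnitude_prune(W)
a, b = low_rank_factorize(W)
q, scale = quantize(W)

print("pruned nonzeros:", np.count_nonzero(w_pruned), "/", W.size)
print("low-rank params:", a.size + b.size, "vs dense:", W.size)
print("max quantization error:", np.abs(W - q.astype(np.float32) * scale).max())
```

In practice these techniques are applied per layer and usually followed by fine-tuning to recover accuracy; the sketch only shows the storage-side transformation each method performs.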

Papers

Showing 1191–1200 of 1356 papers

Title | Status | Hype
Compression of Deep Neural Networks for Image Instance Retrieval | | 0
Compression of Generative Pre-trained Language Models via Quantization | | 0
Compacting Deep Neural Networks for Internet of Things: Methods and Applications | | 0
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt | | 0
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead | | 0
Computation-efficient Deep Learning for Computer Vision: A Survey | | 0
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks | | 0
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks | | 0
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval | | 0
Conditional Automated Channel Pruning for Deep Neural Networks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified