SOTAVerified

Model Compression

Model compression has been an active area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
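
As a rough illustration of two of the methods named above, the sketch below applies magnitude-based parameter pruning and uniform weight quantization to a small list of weights. It is a minimal pure-Python sketch, not the method of any paper listed on this page: the function names (`magnitude_prune`, `uniform_quantize`, `dequantize`) and the sample weights are hypothetical, and low-rank factorization is left out because it needs a linear algebra library.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def uniform_quantize(weights, bits):
    """Map each weight to the nearest of 2**bits evenly spaced levels.

    Returns the integer codes plus the offset and step needed to decode them.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so a single level represents them exactly.
        return [0] * len(weights), lo, 0.0
    step = (hi - lo) / (2 ** bits - 1)
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate weights from integer codes."""
    return [lo + c * step for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.15, -0.3, 0.6]

pruned = magnitude_prune(weights, sparsity=0.5)      # half of the weights become 0.0
codes, lo, step = uniform_quantize(weights, bits=2)  # each weight stored as a 2-bit code
restored = dequantize(codes, lo, step)               # lossy reconstruction
```

Pruning shrinks a model by making weights removable or sparsely storable, while quantization shrinks it by storing a short integer code per weight instead of a full-precision float. Here the reconstruction error of each weight is bounded by half the quantization step.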

Papers

Showing 1176-1200 of 1356 papers

Title | Status | Hype
Compositionality Unlocks Deep Interpretable Models |  | 0
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT |  | 0
Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks |  | 0
Compressing Pre-trained Language Models by Matrix Decomposition |  | 0
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging |  | 0
Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition |  | 0
Spirit Distillation: A Model Compression Method with Multi-domain Knowledge Transfer |  | 0
Sponge Attacks on Sensing AI: Energy-Latency Vulnerabilities and Defense via Model Pruning |  | 0
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting |  | 0
Compression and Localization in Reinforcement Learning for ATARI Games |  | 0
Activation Map Adaptation for Effective Knowledge Distillation |  | 0
Complexity-Driven CNN Compression for Resource-constrained Edge AI |  | 0
Compression for Better: A General and Stable Lossless Compression Framework |  | 0
Compression Laws for Large Language Models |  | 0
Compression of Deep Neural Networks by combining pruning and low rank decomposition |  | 0
Compression of Deep Neural Networks for Image Instance Retrieval |  | 0
Compression of Generative Pre-trained Language Models via Quantization |  | 0
Compacting Deep Neural Networks for Internet of Things: Methods and Applications |  | 0
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt |  | 0
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead |  | 0
Computation-efficient Deep Learning for Computer Vision: A Survey |  | 0
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks |  | 0
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks |  | 0
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval |  | 0
Conditional Automated Channel Pruning for Deep Neural Networks |  | 0
Page 48 of 55

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 |  | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 |  | Unverified