
Model Compression

Model compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
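To make the three techniques named above concrete, here is a minimal PyTorch sketch applying them to a single linear layer. This is an illustrative assumption, not code from any paper listed on this page; the layer size, 50% sparsity target, rank 32, and symmetric 8-bit scheme are arbitrary choices for the example.

```python
# Minimal sketch (assumed example, not from the source page): parameter
# pruning, low-rank factorization, and weight quantization on one layer.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(256, 128)           # illustrative layer size
w = layer.weight.detach().clone()     # (128, 256) weight matrix

# 1) Parameter pruning: zero out the 50% smallest-magnitude weights.
sparsity = 0.5                        # assumed sparsity target
k = int(sparsity * w.numel())
threshold = w.abs().flatten().kthvalue(k).values
w_pruned = torch.where(w.abs() > threshold, w, torch.zeros_like(w))

# 2) Low-rank factorization: keep only the top-r singular components.
U, S, Vh = torch.linalg.svd(w, full_matrices=False)
r = 32                                # assumed rank
w_lowrank = (U[:, :r] * S[:r]) @ Vh[:r, :]

# 3) Weight quantization: uniform symmetric 8-bit quantization.
scale = w.abs().max() / 127.0
w_int8 = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
w_dequant = w_int8.float() * scale    # dequantized copy used at inference

print(f"pruned sparsity: {(w_pruned == 0).float().mean().item():.2f}")
print(f"low-rank relative error: {((w - w_lowrank).norm() / w.norm()).item():.3f}")
print(f"max quantization error: {(w - w_dequant).abs().max().item():.4f}")
```

In practice these steps are typically followed by fine-tuning or knowledge distillation (as in the KD-MRI source above) to recover the accuracy lost to compression.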

Papers

Showing 361–370 of 1356 papers

Title | Status | Hype
Bayesian Deep Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing | — | 0
Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy | — | 0
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models | — | 0
The Potential of AutoML for Recommender Systems | — | 0
Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression | — | 0
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes | — | 0
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Code | 2
A Survey on Transformer Compression | — | 0
Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation | — | 0
Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models | — | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified