
Model Compression

Model compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
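As a rough illustration of two of the techniques named above, the sketch below applies magnitude-based parameter pruning and uniform weight quantization to a weight matrix. This is a minimal NumPy sketch under assumed settings (the 90% sparsity level, the 2-bit width, and the function names are illustrative choices, not taken from any paper listed here):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights so that `sparsity` fraction is zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def uniform_quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Map weights to 2**bits evenly spaced levels, then dequantize back to floats."""
    levels = 2 ** bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels
    q = np.round((weights - w_min) / scale)
    return q * scale + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)

w_pruned = magnitude_prune(w, sparsity=0.9)   # keep only the largest 10% of weights
w_quant = uniform_quantize(w_pruned, bits=2)  # 2-bit uniform quantization

print(f"nonzero fraction after pruning: {np.count_nonzero(w_pruned) / w.size:.3f}")
print(f"unique values after quantization: {len(np.unique(w_quant))}")
```

In practice these steps are usually interleaved with fine-tuning so the network can recover accuracy lost to pruning and quantization; the sketch only shows the compression operations themselves.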

Papers

Showing 521–530 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Extraction of nonlinearity in neural networks with Koopman operator | | 0 |
| Model Compression and Efficient Inference for Large Language Models: A Survey | | 0 |
| Bayesian Deep Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing | | 0 |
| Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy | | 0 |
| L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models | | 0 |
| Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression | | 0 |
| The Potential of AutoML for Recommender Systems | | 0 |
| Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes | | 0 |
| Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation | | 0 |
| A Survey on Transformer Compression | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |
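For context on the "b-bit, 1-dim" naming above: DKM-style compression clusters each scalar weight into 2^b shared values (2 bits gives 2² = 4 distinct weight values per layer). The sketch below shows only the simpler hard k-means weight-clustering idea, as an assumption-laden illustration; DKM itself learns soft, differentiable cluster assignments during training, which this sketch does not implement:

```python
import numpy as np

def kmeans_cluster_weights(weights: np.ndarray, bits: int, iters: int = 20) -> np.ndarray:
    """Hard k-means weight clustering: replace each scalar weight with the
    nearest of 2**bits learned centroids (the 'b-bit, 1-dim' setting)."""
    k = 2 ** bits
    flat = weights.ravel()
    # Initialize centroids evenly over the weight range.
    centroids = np.linspace(flat.min(), flat.max(), k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = flat[assign == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids[assign].reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(128, 128)).astype(np.float32)
w_2bit = kmeans_cluster_weights(w, bits=2)  # at most 4 shared weight values
print(len(np.unique(w_2bit)))
```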