SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power and resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to compress the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
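To make two of the techniques named above concrete, here is a minimal, illustrative sketch of magnitude-based parameter pruning and symmetric uniform weight quantization. It assumes PyTorch; the helper names (magnitude_prune_, quantize_weights) are our own for illustration and are not taken from any of the papers listed below.

```python
import torch
import torch.nn as nn


def magnitude_prune_(linear: nn.Linear, sparsity: float = 0.5) -> None:
    """Zero out (in place) the smallest-magnitude fraction of the layer's weights."""
    w = linear.weight.data
    k = int(sparsity * w.numel())
    if k == 0:
        return
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = w.abs().flatten().kthvalue(k).values
    w.mul_((w.abs() > threshold).float())


def quantize_weights(w: torch.Tensor, num_bits: int = 8):
    """Symmetric uniform quantization: return integer weights and a float scale."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax).to(torch.int8)
    return q, scale


layer = nn.Linear(128, 64)
magnitude_prune_(layer, sparsity=0.7)                 # ~70% of weights set to zero
q, scale = quantize_weights(layer.weight.data)        # 8-bit weights plus one fp32 scale
w_hat = q.float() * scale                             # dequantized approximation
print((layer.weight.data == 0).float().mean().item(),
      (layer.weight.data - w_hat).abs().max().item())
```

In a real compression pipeline the pruned and quantized model would typically be fine-tuned afterwards to recover accuracy; this sketch only shows the weight transformations themselves.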

Papers

Showing 451–475 of 1356 papers

Title | Status | Hype
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning | | 0
Efficient Memory Management for GPU-based Deep Learning Systems | | 0
Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks | | 0
Efficient Model Compression for Hierarchical Federated Learning | | 0
Efficient Model Compression Techniques with FishLeg | | 0
Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization | | 0
Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices | | 0
Data-Independent Structured Pruning of Neural Networks via Coresets | | 0
Automated Model Compression by Jointly Applied Pruning and Quantization | | 0
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion | | 0
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models | | 0
Data-Free Quantization via Pseudo-label Filtering | | 0
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning | | 0
Efficient Speech Representation Learning with Low-Bit Quantization | | 0
Automated Inference of Graph Transformation Rules | | 0
Efficient Supernet Training with Orthogonal Softmax for Scalable ASR Model Compression | | 0
Data-Free Knowledge Transfer: A Survey | | 0
Auto Graph Encoder-Decoder for Neural Network Pruning | | 0
A Low-Power Streaming Speech Enhancement Accelerator For Edge Devices | | 0
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models | | 0
Acoustic Model Compression with MAP adaptation | | 0
Communication-Efficient Distributed Online Learning with Kernels | | 0
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images | | 0
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models | | 0
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models | | 0
Page 19 of 55

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified