SOTAVerified

Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.
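A minimal NumPy sketch of two of the techniques named above, magnitude-based weight pruning and low-rank factorization of a dense layer's weight matrix. The layer shape, rank, and sparsity level are illustrative assumptions, not values taken from any paper listed here.

```python
import numpy as np

# Hypothetical dense-layer weight matrix (shape chosen for illustration).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    k = int(W.size * sparsity)
    thresh = np.partition(np.abs(W).ravel(), k)[k]
    return np.where(np.abs(W) < thresh, 0.0, W)

def low_rank_factorize(W, rank):
    """Approximate W as a product of two thin factors via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # (256, rank), singular values folded in
    V_r = Vt[:rank, :]             # (rank, 512)
    return U_r, V_r

W_pruned = magnitude_prune(W, sparsity=0.9)   # ~90% of entries become zero
U_r, V_r = low_rank_factorize(W, rank=32)

# Storage drops from 256*512 params to (256 + 512)*32 for the factored layer.
print(W.size, U_r.size + V_r.size)  # 131072 24576
```

At inference time the factored layer computes `x @ U_r @ V_r` instead of `x @ W`, trading a small approximation error for roughly 5x fewer parameters at this rank.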

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow

Papers

Showing 451–500 of 1356 papers

Title | Status | Hype
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning | | 0
Efficient Memory Management for GPU-based Deep Learning Systems | | 0
MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors | | 0
Efficient Model Compression for Hierarchical Federated Learning | | 0
Efficient Model Compression Techniques with FishLeg | | 0
Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization | | 0
Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks | | 0
FedCode: Communication-Efficient Federated Learning via Transferring Codebooks | | 0
Decoupling Weight Regularization from Batch Size for Model Compression | | 0
Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion | | 0
Debiased Distillation by Transplanting the Last Layer | | 0
Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks | | 0
Efficient Recurrent Neural Networks using Structured Matrices in FPGAs | | 0
Efficient Speech Representation Learning with Low-Bit Quantization | | 0
Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices | | 0
Efficient Supernet Training with Orthogonal Softmax for Scalable ASR Model Compression | | 0
Data-Independent Structured Pruning of Neural Networks via Coresets | | 0
Automated Model Compression by Jointly Applied Pruning and Quantization | | 0
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models | | 0
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models | | 0
ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks | | 0
Communication-Efficient Distributed Online Learning with Kernels | | 0
Data-Free Quantization via Pseudo-label Filtering | | 0
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models | | 0
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications | | 0
Enabling All In-Edge Deep Learning: A Literature Review | | 0
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning | | 0
Automated Inference of Graph Transformation Rules | | 0
Data-Free Knowledge Transfer: A Survey | | 0
Auto Graph Encoder-Decoder for Neural Network Pruning | | 0
A Low-Power Streaming Speech Enhancement Accelerator For Edge Devices | | 0
Acoustic Model Compression with MAP adaptation | | 0
Enhanced Sparsification via Stimulative Training | | 0
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images | | 0
Compacting Deep Neural Networks for Internet of Things: Methods and Applications | | 0
Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations | | 0
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models | | 0
Enhancing Targeted Attack Transferability via Diversified Weight Pruning | | 0
A Low Effort Approach to Structured CNN Design Using PCA | | 0
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification | | 0
FASTNav: Fine-tuned Adaptive Small-language-models Trained for Multi-point Robot Navigation | | 0
EPIM: Efficient Processing-In-Memory Accelerators based on Epitome | | 0
Data-Free Distillation of Language Model by Text-to-Text Transfer | | 0
Error-aware Quantization through Noise Tempering | | 0
26ms Inference Time for ResNet-50: Towards Real-Time Execution of all DNNs on Smartphone | | 0
Data-Free Adversarial Knowledge Distillation for Graph Neural Networks | | 0
AutoBSS: An Efficient Algorithm for Block Stacking Style Search | | 0
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes | | 0
FedNILM: Applying Federated Learning to NILM Applications at the Edge | | 0
Data-Driven Compression of Convolutional Neural Networks | | 0
Page 10 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified