SOTAVerified

Model Compression

Model compression has been an active area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to compress deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
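
As a rough illustration of the three techniques named in the description above, here is a minimal NumPy sketch that applies each one to a single weight matrix. It is not taken from any paper listed on this page. The function names (`magnitude_prune`, `low_rank_factorize`, `quantize_uniform`, `dequantize`) are hypothetical, and the specific choices (global magnitude thresholding, truncated SVD, uniform min-max quantization) are assumptions picked for simplicity.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Parameter pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # The k-th smallest absolute value is the pruning threshold.
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) > threshold, w, 0.0)


def low_rank_factorize(w, rank):
    """Low-rank factorization: approximate w by a @ b via truncated SVD."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (rows, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, cols)
    return a, b


def quantize_uniform(w, bits):
    """Weight quantization: map weights to integer codes on a uniform grid."""
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((w - lo) / scale).astype(np.int32)
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from integer codes."""
    return codes * scale + lo


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 6))

    pruned = magnitude_prune(w, sparsity=0.5)
    a, b = low_rank_factorize(w, rank=2)
    codes, scale, lo = quantize_uniform(w, bits=4)

    print("nonzero weights after pruning:", np.count_nonzero(pruned))
    print("low-rank parameter count:", a.size + b.size, "vs", w.size)
    print("max quantization error:", np.abs(dequantize(codes, scale, lo) - w).max())
```

In practice these steps are usually followed by fine-tuning to recover accuracy, and the methods in the papers below are considerably more elaborate than this sketch.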

Papers

Showing 826-850 of 1356 papers

Title | Status | Hype
Random Offset Block Embedding Array (ROBE) for CriteoTB Benchmark MLPerf DLRM Model : 1000× Compression and 3.1× Faster Inference |  | 0
Learning a Neural Diff for Speech Models |  | 0
QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning |  | 0
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework |  | 0
Pruning Ternary Quantization |  | 0
A New Clustering-Based Technique for the Acceleration of Deep Convolutional Networks |  | 0
Accelerating deep neural networks for efficient scene understanding in automotive cyber-physical systems |  | 0
Federated Action Recognition on Heterogeneous Embedded Devices |  | 0
Efficient automated U-Net based tree crown delineation using UAV multi-spectral imagery on embedded devices |  | 0
Compact and Optimal Deep Learning with Recurrent Parameter Generators | Code | 0
Model compression as constrained optimization, with application to neural nets. Part V: combining compressions |  | 0
WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations |  | 0
A Light-weight Deep Human Activity Recognition Algorithm Using Multi-knowledge Distillation |  | 0
Universal approximation and model compression for radial neural networks | Code | 0
Investigation of Practical Aspects of Single Channel Speech Separation for ASR |  | 0
A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification |  | 0
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation | Code | 1
Pool of Experts: Realtime Querying Specialized Knowledge in Massive Neural Networks | Code | 0
Exact Backpropagation in Binary Weighted Networks with Group Weight Transformations | Code | 0
Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams |  | 0
Image Classification with CondenseNeXt for ARM-Based Computing Platforms | Code | 0
PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation |  | 0
Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner |  | 0
Network Pruning via Performance Maximization | Code | 0
Learning Student Networks in the Wild | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 |  | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 |  | Unverified