SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
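
The three method families named above can each be illustrated in a few lines. Below is a minimal NumPy sketch (purely illustrative, not drawn from any listed paper): magnitude-based parameter pruning, truncated-SVD low-rank factorization, and uniform symmetric weight quantization, all applied to a random weight matrix. All function names and parameters are hypothetical.

```python
# Illustrative sketches of pruning, low-rank factorization, and quantization.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128)).astype(np.float32)

# 1) Parameter pruning: zero out the fraction of weights with smallest magnitude.
def magnitude_prune(w, sparsity=0.9):
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)

# 2) Low-rank factorization: approximate W with two thin rank-r factors via SVD.
def low_rank(w, rank=16):
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]

# 3) Weight quantization: uniform symmetric quantization to n bits.
def quantize(w, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale  # dequantize with q * scale

W_sparse = magnitude_prune(W)
A, B = low_rank(W)
Q, s = quantize(W)
print("pruned nonzero fraction:", np.count_nonzero(W_sparse) / W.size)
print("low-rank relative error:", np.linalg.norm(W - A @ B) / np.linalg.norm(W))
print("quantization relative error:", np.linalg.norm(W - Q * s) / np.linalg.norm(W))
```
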

Papers

Showing 531–540 of 1356 papers

Title | Status | Hype
Inferring ECG from PPG for Continuous Cardiac Monitoring Using Lightweight Neural Network | | 0
FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep Neural Networks | | 0
Cross-Channel Intragroup Sparsity Neural Network | | 0
Croesus: Multi-Stage Processing and Transactions for Video-Analytics in Edge-Cloud Systems | | 0
Attention Sinks and Outlier Features: A 'Catch, Tag, and Release' Mechanism for Embeddings | | 0
Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices | | 0
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models | | 0
ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks | | 0
AACP: Model Compression by Accurate and Automatic Channel Pruning | | 0
FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified
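
The DKM entries above refer to k-means-style clustering of weights, where "2bit-1dim" means each scalar weight is replaced by one of 2^2 = 4 shared centroids. The sketch below is a hard-assignment k-means analogue, assuming a plain Lloyd's-iteration setup; DKM itself makes the cluster assignment differentiable during training, which is not reproduced here. All names and parameters are hypothetical.

```python
# Hard k-means weight clustering: a simplified, inference-time analogue of
# the "2bit-1dim" compression referenced in the benchmark table above.
import numpy as np

def kmeans_quantize(w, bits=2, iters=20):
    flat = w.reshape(-1)
    k = 2 ** bits  # 2 bits -> 4 centroids per scalar weight
    # Initialize centroids at evenly spaced quantiles of the weight values.
    centroids = np.quantile(flat, np.linspace(0, 1, k))
    for _ in range(iters):
        # Assign each weight to its nearest centroid, then recompute centroids.
        assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            members = flat[assign == j]
            if members.size:
                centroids[j] = members.mean()
    assign = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    # Compressed form: 2-bit codes per weight plus the k-entry codebook.
    return centroids[assign].reshape(w.shape), assign, centroids

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512)).astype(np.float32)
W_hat, codes, codebook = kmeans_quantize(W)
print("reconstruction relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```
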