SOTAVerified

Model Compression

Model compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
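The three compression families named in the definition above can be sketched on a single weight matrix. This is a minimal illustration, not the method of any paper listed on this page; the helper names (`magnitude_prune`, `uniform_quantize`, `low_rank_factorize`) and all parameter choices are assumptions made for the example:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def uniform_quantize(weights, num_bits=8):
    """Uniform affine quantization to 2**num_bits levels, dequantized back
    to float so the rounding error is directly visible."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / qmax
    q = np.round((weights - w_min) / scale)
    return q * scale + w_min

def low_rank_factorize(weights, rank):
    """Low-rank factorization via truncated SVD: W is approximated by A @ B,
    stored with far fewer parameters than W itself."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)

W_pruned = magnitude_prune(W, sparsity=0.9)   # ~90% of entries become zero
W_quant = uniform_quantize(W, num_bits=4)     # 16 levels instead of float32
A, B = low_rank_factorize(W, rank=16)         # 2 * 64 * 16 params vs 64 * 64
```

In practice each of these steps is followed by fine-tuning (or, as in the cited KD-MRI work, knowledge distillation) to recover the accuracy lost to compression.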

Papers

Showing 1151–1175 of 1356 papers

Title | Status | Hype
Small Language Models: Architectures, Techniques, Evaluation, Problems and Future Adaptation | | 0
Small Object Detection Based on Modified FSSD and Model Compression | | 0
Smart Environmental Monitoring of Marine Pollution using Edge AI | | 0
SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation | | 0
Smooth Model Compression without Fine-Tuning | | 0
CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation | | 0
Soft Labeling Affects Out-of-Distribution Detection of Deep Neural Networks | | 0
Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge | | 0
SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching | | 0
Sparse Deep Learning for Time Series Data: Theory and Applications | | 0
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles | | 0
Sparse Unbalanced GAN Training with In-Time Over-Parameterization | | 0
Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks | | 0
Activation Sparsity Opportunities for Compressing General Large Language Models | | 0
Compressible Spectral Mixture Kernels with Sparse Dependency Structures for Gaussian Processes | | 0
Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error | | 0
Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models | | 0
Comprehensive Survey of Model Compression and Speed up for Vision Transformers | | 0
Speeding up Convolutional Neural Networks with Low Rank Expansions | | 0
Compressed models are NOT miniature versions of large models | | 0
Speeding Up Image Classifiers with Little Companions | | 0
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models | | 0
Compressing Cross-Lingual Multi-Task Models at Qualtrics | | 0
Compressing Deep Convolutional Neural Networks by Stacking Low-dimensional Binary Convolution Filters | | 0
Compressing Deep Neural Networks via Layer Fusion | | 0
Page 47 of 55

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified