SOTAVerified

Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
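Two of the techniques named above, unstructured magnitude pruning and weight quantization, can be sketched in a few lines. The example below is a minimal illustration, not the method of any listed paper; the function names, the 50% sparsity target, and the symmetric per-tensor 8-bit scheme are illustrative choices.

```python
def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of weights (unstructured pruning).
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_int8(weights):
    # Symmetric per-tensor quantization: store 8-bit integer codes plus one
    # float scale, roughly a 4x size reduction versus float32 weights.
    scale = max(abs(w) for w in weights) / 127.0 or 1e-8
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    # Approximate reconstruction; error per weight is at most scale / 2.
    return [c * scale for c in codes]
```

Low-rank factorization, the third method mentioned, follows the same storage-for-accuracy trade: a weight matrix is replaced by the product of two thinner matrices.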

Papers

Showing 1151–1200 of 1356 papers

Title (every paper on this page shows a Hype score of 0 and no verification status)

Small Language Models: Architectures, Techniques, Evaluation, Problems and Future Adaptation
Small Object Detection Based on Modified FSSD and Model Compression
Smart Environmental Monitoring of Marine Pollution using Edge AI
SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation
Smooth Model Compression without Fine-Tuning
CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation
Soft Labeling Affects Out-of-Distribution Detection of Deep Neural Networks
Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge
SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching
Sparse Deep Learning for Time Series Data: Theory and Applications
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles
Sparse Unbalanced GAN Training with In-Time Over-Parameterization
Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks
Activation Sparsity Opportunities for Compressing General Large Language Models
Compressible Spectral Mixture Kernels with Sparse Dependency Structures for Gaussian Processes
Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error
Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models
Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Speeding up Convolutional Neural Networks with Low Rank Expansions
Compressed models are NOT miniature versions of large models
Speeding Up Image Classifiers with Little Companions
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Compressing Cross-Lingual Multi-Task Models at Qualtrics
Compressing Deep Convolutional Neural Networks by Stacking Low-dimensional Binary Convolution Filters
Compressing Deep Neural Networks via Layer Fusion
Compositionality Unlocks Deep Interpretable Models
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks
Compressing Pre-trained Language Models by Matrix Decomposition
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging
Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition
Spirit Distillation: A Model Compression Method with Multi-domain Knowledge Transfer
Sponge Attacks on Sensing AI: Energy-Latency Vulnerabilities and Defense via Model Pruning
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting
Compression and Localization in Reinforcement Learning for ATARI Games
Activation Map Adaptation for Effective Knowledge Distillation
Complexity-Driven CNN Compression for Resource-constrained Edge AI
Compression for Better: A General and Stable Lossless Compression Framework
Compression Laws for Large Language Models
Compression of Deep Neural Networks by combining pruning and low rank decomposition
Compression of Deep Neural Networks for Image Instance Retrieval
Compression of Generative Pre-trained Language Models via Quantization
Compacting Deep Neural Networks for Internet of Things: Methods and Applications
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Computation-efficient Deep Learning for Computer Vision: A Survey
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval
Conditional Automated Channel Pruning for Deep Neural Networks
Page 24 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified