
Model Compression

Model compression has been an actively pursued area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
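As a concrete illustration of the three techniques named above, here is a minimal numpy sketch of magnitude-based parameter pruning, SVD-based low-rank factorization, and symmetric uniform weight quantization applied to a single weight matrix. The sparsity level, rank, and bit-width are illustrative choices of this sketch, not values taken from any paper listed below.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Parameter pruning: zero out the smallest-magnitude fraction of weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def low_rank_factorize(w: np.ndarray, rank: int) -> np.ndarray:
    """Low-rank factorization: keep only the top `rank` singular components."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def uniform_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Weight quantization: symmetric uniform quantize, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
print(f"pruned nonzeros: {np.count_nonzero(magnitude_prune(w, 0.9)) / w.size:.1%}")
print(f"rank-16 error:   {np.linalg.norm(w - low_rank_factorize(w, 16)):.2f}")
print(f"8-bit error:     {np.linalg.norm(w - uniform_quantize(w, 8)):.2f}")
```

In practice these steps are applied per layer and usually combined with fine-tuning to recover accuracy; the sketch shows only the compression step itself.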

Papers

Showing 451–500 of 1356 papers

| Title | Status | Hype |
| --- | --- | --- |
| Compressing Convolutional Neural Networks via Factorized Convolutional Filters | Code | 0 |
| Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression | Code | 0 |
| FedSynth: Gradient Compression via Synthetic Data in Federated Learning | Code | 0 |
| Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer | Code | 0 |
| From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression | Code | 0 |
| Compressed models are NOT miniature versions of large models | | 0 |
| Artemis: HE-Aware Training for Efficient Privacy-Preserving Machine Learning | | 0 |
| Comprehensive Survey of Model Compression and Speed up for Vision Transformers | | 0 |
| Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | | 0 |
| Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models | | 0 |
| ESPACE: Dimensionality Reduction of Activations for Model Compression | | 0 |
| Compositionality Unlocks Deep Interpretable Models | | 0 |
| A Comprehensive Review and a Taxonomy of Edge Machine Learning: Requirements, Paradigms, and Techniques | | 0 |
| Accelerating Very Deep Convolutional Networks for Classification and Detection | | 0 |
| EPSD: Early Pruning with Self-Distillation for Efficient Model Compression | | 0 |
| EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | | 0 |
| CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting | | 0 |
| Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks | | 0 |
| Enhancing Targeted Attack Transferability via Diversified Weight Pruning | | 0 |
| Complexity-Driven CNN Compression for Resource-constrained Edge AI | | 0 |
| Architecture Compression | | 0 |
| Compacting Deep Neural Networks for Internet of Things: Methods and Applications | | 0 |
| Enhanced Sparsification via Stimulative Training | | 0 |
| CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks | | 0 |
| Energy-efficient Knowledge Distillation for Spiking Neural Networks | | 0 |
| EncCluster: Scalable Functional Encryption in Federated Learning through Weight Clustering and Probabilistic Filters | | 0 |
| Compact CNN Structure Learning by Knowledge Distillation | | 0 |
| A Progressive Sub-Network Searching Framework for Dynamic Inference | | 0 |
| A Deep Cascade Network for Unaligned Face Attribute Classification | | 0 |
| Accelerating Machine Learning Primitives on Commodity Hardware | | 0 |
| Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer | | 0 |
| Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels | | 0 |
| Enabling All In-Edge Deep Learning: A Literature Review | | 0 |
| Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications | | 0 |
| Communication-Efficient Federated Learning with Adaptive Compression under Dynamic Bandwidth | | 0 |
| Enhancing Inference Efficiency of Large Language Models: Investigating Optimization Strategies and Architectural Innovations | | 0 |
| Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models | | 0 |
| ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks | | 0 |
| Communication-Efficient Distributed Online Learning with Kernels | | 0 |
| A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework | | 0 |
| E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models | | 0 |
| EPIM: Efficient Processing-In-Memory Accelerators based on Epitome | | 0 |
| Efficient Transformer Knowledge Distillation: A Performance Review | | 0 |
| Error-aware Quantization through Noise Tempering | | 0 |
| Approximability and Generalisation | | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | | 0 |
| Efficient Supernet Training with Orthogonal Softmax for Scalable ASR Model Compression | | 0 |
| Efficient Speech Representation Learning with Low-Bit Quantization | | 0 |
| Efficient Recurrent Neural Networks using Structured Matrices in FPGAs | | 0 |
| CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | | 0 |
Page 10 of 28

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified |
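The DKM entries above refer to differentiable k-means weight clustering, where "2bit-1dim" means each scalar weight is snapped to one of 2^2 = 4 shared centroids. Below is a heavily simplified numpy sketch of the soft, differentiable cluster assignment at the heart of that idea; the temperature `tau`, the quantile-based centroid initialization, and the fixed iteration count are assumptions of this sketch, and the actual DKM method runs the clustering inside training rather than as a post-hoc step.

```python
import numpy as np

def soft_assign(w: np.ndarray, centroids: np.ndarray, tau: float = 0.05) -> np.ndarray:
    """Soft assignment of each scalar weight to each centroid.

    Softmax over negative |weight - centroid| distances: differentiable,
    unlike the hard argmin used by ordinary k-means.
    """
    d = np.abs(w[:, None] - centroids[None, :])   # (n_weights, n_centroids)
    d -= d.min(axis=1, keepdims=True)             # stabilize the softmax
    a = np.exp(-d / tau)
    return a / a.sum(axis=1, keepdims=True)

def cluster_step(w: np.ndarray, centroids: np.ndarray, tau: float = 0.05) -> np.ndarray:
    """One soft k-means iteration: move centroids to assignment-weighted means."""
    a = soft_assign(w, centroids, tau)
    return (a * w[:, None]).sum(axis=0) / a.sum(axis=0)

w = np.random.default_rng(0).normal(size=4096).astype(np.float32)
c = np.quantile(w, [0.125, 0.375, 0.625, 0.875])  # 2-bit -> 4 centroids
for _ in range(10):
    c = cluster_step(w, c)
w_2bit = c[np.argmax(soft_assign(w, c), axis=1)]  # hard snap for inference
print(f"unique values after 2-bit clustering: {np.unique(w_2bit).size}")
```

After clustering, only the centroid table and the per-weight 2-bit indices need to be stored, which is where the compression comes from.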