SOTAVerified

Model Compression

Model compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to reduce the size of deep networks.
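The three families of methods named above can each be sketched in a few lines of NumPy. This is a minimal illustration on a toy 64x64 weight matrix; the 90% sparsity level, rank 8, and 8-bit width are arbitrary choices for the example, not values taken from any of the papers listed below.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # toy dense weight matrix

# 1) Magnitude pruning: zero out the 90% of weights with smallest magnitude.
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# 2) Low-rank factorization: approximate W by its rank-8 truncated SVD.
#    In practice only the two thin factors are stored, not the product.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
r = 8
W_lowrank = (U[:, :r] * S[:r]) @ Vt[:r]

# 3) Uniform 8-bit quantization: map floats to int8 with one shared scale,
#    then dequantize to measure the rounding error.
scale = float(np.abs(W).max()) / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dequant = W_q.astype(np.float32) * scale
```

Pruning stores only the surviving weights (plus a sparsity mask), factorization stores two thin matrices instead of one dense one, and quantization stores one byte per weight plus a single scale; real pipelines usually fine-tune after each step to recover accuracy.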

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow

Papers

Showing 201-250 of 1356 papers

Title | Status | Hype
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch | Code | 1
Distilling Linguistic Context for Language Model Compression | Code | 1
Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching | Code | 1
Skip-Convolutions for Efficient Video Processing | Code | 1
An Empirical Study of CLIP for Text-based Person Search | Code | 1
Sparse Probabilistic Circuits via Pruning and Growing | Code | 1
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators | Code | 1
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding | Code | 1
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming | Code | 1
Artemis: HE-Aware Training for Efficient Privacy-Preserving Machine Learning | | 0
Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | | 0
Accelerating Very Deep Convolutional Networks for Classification and Detection | | 0
Accelerating Machine Learning Primitives on Commodity Hardware | | 0
Compositionality Unlocks Deep Interpretable Models | | 0
CORSD: Class-Oriented Relational Self Distillation | | 0
Cosine Similarity Knowledge Distillation for Individual Class Information Transfer | | 0
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting | | 0
Complexity-Driven CNN Compression for Resource-constrained Edge AI | | 0
Architecture Compression | | 0
A Progressive Sub-Network Searching Framework for Dynamic Inference | | 0
Compact CNN Structure Learning by Knowledge Distillation | | 0
A Deep Cascade Network for Unaligned Face Attribute Classification | | 0
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning | | 0
CoSurfGS: Collaborative 3D Surface Gaussian Splatting with Distributed Learning for Large Scene Reconstruction | | 0
Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity | | 0
A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework | | 0
2-bit Conformer quantization for automatic speech recognition | | 0
Approximability and Generalisation | | 0
CoLLD: Contrastive Layer-to-layer Distillation for Compressing Multilingual Pre-trained Speech Encoders | | 0
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy | | 0
Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization | | 0
Collaborative Teacher-Student Learning via Multiple Knowledge Transfer | | 0
ADC/DAC-Free Analog Acceleration of Deep Neural Networks with Frequency Transformation | | 0
Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization | | 0
Communication-Efficient Distributed Online Learning with Kernels | | 0
Applications of Knowledge Distillation in Remote Sensing: A Survey | | 0
Communication-Efficient Federated Learning with Adaptive Compression under Dynamic Bandwidth | | 0
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization | | 0
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning | | 0
A Partial Regularization Method for Network Compression | | 0
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks | | 0
Compacting Deep Neural Networks for Internet of Things: Methods and Applications | | 0
Closed-Loop Neural Interfaces with Embedded Machine Learning | | 0
Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications | | 0
An Overview of Neural Network Compression | | 0
AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications | | 0
10K is Enough: An Ultra-Lightweight Binarized Network for Infrared Small-Target Detection | | 0
Convolutional Neural Network Compression Based on Low-Rank Decomposition | | 0
A Comprehensive Review and a Taxonomy of Edge Machine Learning: Requirements, Paradigms, and Techniques | | 0
Integrating Fairness and Model Pruning Through Bi-level Optimization | | 0
Page 5 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified
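The DKM entries above refer to differentiable k-means clustering of weights: at "2bit-1dim", each scalar weight is replaced by a 2-bit index into a learned codebook of 2^2 = 4 centroids. A plain (non-differentiable) k-means sketch of that storage scheme is below, on a made-up weight vector; DKM itself additionally makes the assignment step differentiable so the codebook can be trained end to end.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)  # toy flattened weight tensor

# 2-bit, 1-dimensional clustering: 4 scalar centroids shared by all weights.
k = 4
centroids = np.quantile(w, np.linspace(0.1, 0.9, k))  # simple spread-out init

# Standard Lloyd iterations: assign each weight to its nearest centroid,
# then move each centroid to the mean of its assigned weights.
for _ in range(20):
    assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
    for j in range(k):
        if np.any(assign == j):
            centroids[j] = w[assign == j].mean()

# Compressed form: a 2-bit index per weight plus the 4-entry codebook.
w_compressed = centroids[assign]
```

Storage drops from 32 bits per weight to 2 bits per weight (plus the tiny codebook); the 1-bit row in the table is the same idea with only 2 centroids, which explains its much larger accuracy loss.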