
Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
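
To make two of the techniques named above concrete, here is a minimal NumPy sketch of magnitude-based parameter pruning and uniform weight quantization. The function names, sparsity target, and bit width are illustrative assumptions, not taken from any listed paper.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def uniform_quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Round weights to 2**bits evenly spaced levels, then dequantize back."""
    levels = 2 ** bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels
    q = np.round((weights - w_min) / scale)  # integer codes in [0, levels]
    return q * scale + w_min

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(256, 256)).astype(np.float32)
    w_pruned = magnitude_prune(w, sparsity=0.9)  # 90% of weights set to zero
    w_quant = uniform_quantize(w, bits=2)        # at most 4 distinct values
    print(f"sparsity: {np.mean(w_pruned == 0):.2f}")
    print(f"unique quantized values: {len(np.unique(w_quant))}")
```

In practice both steps are typically followed by fine-tuning to recover accuracy; the sketch shows only the one-shot compression transforms themselves.
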

Papers

Showing 851–900 of 1356 papers

Title | Status | Hype
Unraveling Key Factors of Knowledge Distillation | | 0
Unsupervised model compression for multilayer bootstrap networks | | 0
UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles | | 0
USDC: Unified Static and Dynamic Compression for Visual Transformer | | 0
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models | | 0
Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training | | 0
Variational autoencoder-based neural network model compression | | 0
VIC-KD: Variance-Invariance-Covariance Knowledge Distillation to Make Keyword Spotting More Robust Against Adversarial Attacks | | 0
Vision Foundation Models in Medical Image Analysis: Advances and Challenges | | 0
Vision-Language Models for Edge Networks: A Comprehensive Survey | | 0
Vision Transformers on the Edge: A Comprehensive Survey of Model Compression and Acceleration Strategies | | 0
VQ4ALL: Efficient Neural Network Representation via a Universal Codebook | | 0
Wasserstein Contrastive Representation Distillation | | 0
Watermarking Graph Neural Networks by Random Graphs | | 0
WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations | | 0
Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators | | 0
Weight Normalization based Quantization for Deep Neural Network Compression | | 0
Weight Squeezing: Reparameterization for Knowledge Transfer and Model Compression | | 0
Weight Squeezing: Reparameterization for Compression and Fast Inference | | 0
Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding | | 0
What do larger image classifiers memorise? | | 0
What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias | | 0
What is Lost in Knowledge Distillation? | | 0
What Makes a Good Dataset for Knowledge Distillation? | | 0
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models | | 0
XAI-BayesHAR: A novel Framework for Human Activity Recognition with Integrated Uncertainty and Shapely Values | | 0
YANMTT: Yet Another Neural Machine Translation Toolkit | | 0
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning | | 0
Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models | | 0
InfantCryNet: A Data-driven Framework for Intelligent Analysis of Infant Cries | | 0
Inference Optimization of Foundation Models on AI Accelerators | | 0
Information-Theoretic GAN Compression with Variational Energy-based Model | | 0
Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection | | 0
InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer | | 0
INSIGHT: A Survey of In-Network Systems for Intelligent, High-Efficiency AI and Topology Optimization | | 0
Instance-Aware Group Quantization for Vision Transformers | | 0
Integral Pruning on Activations and Weights for Efficient Neural Networks | | 0
PublicCheck: Public Integrity Verification for Services of Run-time Deep Models | | 0
Interpreting Deep Classifier by Visual Distillation of Dark Knowledge | | 0
Redundancy and Concept Analysis for Code-trained Language Models | | 0
Intrinsically Sparse Long Short-Term Memory Networks | | 0
Investigation of Practical Aspects of Single Channel Speech Separation for ASR | | 0
Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing | | 0
IteRABRe: Iterative Recovery-Aided Block Reduction | | 0
Iterative Compression of End-to-End ASR Model using AutoML | | 0
It's always personal: Using Early Exits for Efficient On-Device CNN Personalisation | | 0
Joint Neural Architecture Search and Quantization | | 0
Joint Regularization on Activations and Weights for Efficient Neural Network Pruning | | 0
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification | | 0
Page 18 of 28

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified
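
The DKM entries above refer to differentiable k-means weight clustering (DKM), where "2bit-1dim" appears to denote a 4-entry codebook (2 bits per weight) learned over individual scalar weights (1-dimensional clusters). DKM itself makes cluster assignment differentiable so it can be trained end to end; the sketch below shows only the hard k-means palettization idea behind such a representation, with illustrative function names and initialization, not the paper's actual algorithm.

```python
import numpy as np

def kmeans_palettize(weights: np.ndarray, bits: int, iters: int = 25):
    """Hard k-means over scalar weights: a 2**bits-entry codebook plus
    bits-per-weight indices replaces the full-precision tensor."""
    w = weights.ravel()
    k = 2 ** bits
    # Initialize centroids at evenly spaced quantiles of the weight distribution.
    codebook = np.quantile(w, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        # Move each centroid to the mean of its assigned weights.
        for c in range(k):
            members = w[idx == c]
            if members.size:  # keep a centroid in place if it has no members
                codebook[c] = members.mean()
    return idx.reshape(weights.shape), codebook

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(128, 128)).astype(np.float32)
    idx, codebook = kmeans_palettize(w, bits=2)  # "2bit-1dim": 4 scalar centroids
    w_hat = codebook[idx]                        # dequantized weights
    print("codebook:", np.round(codebook, 3))
    print("reconstruction MSE:", float(np.mean((w - w_hat) ** 2)))
```

At 2 bits per weight the index tensor is roughly 16x smaller than the float32 original (plus a negligible codebook), which is why the accuracy gap between the 2-bit and 1-bit rows above is the interesting quantity.
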