SOTAVerified

Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are some of the methods proposed to compress deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
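Two of the techniques named above, magnitude pruning and weight quantization, can be sketched in a few lines of numpy. This is a minimal illustration, not the method of any listed paper; the function names and the 8-bit uniform scheme are choices made here for clarity.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def quantize_uint8(weights):
    """Uniform 8-bit quantization: map floats onto a 256-level linear grid."""
    lo, hi = float(weights.min()), float(weights.max())
    scale = (hi - lo) / 255.0
    q = np.round((weights - lo) / scale).astype(np.uint8)
    # recover approximate weights with: q * scale + lo
    return q, scale, lo
```

Pruning trades accuracy for sparsity that sparse kernels or compressed storage can exploit, while quantization shrinks each weight from 32 bits to 8 (or fewer, as in the DKM entries in the benchmark table below).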

Papers

Showing 101-150 of 1,356 papers

Title | Status | Hype
Initialization and Regularization of Factorized Neural Layers | Code | 1
An Empirical Study of CLIP for Text-based Person Search | Code | 1
A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness | Code | 1
Dual Relation Knowledge Distillation for Object Detection | Code | 1
Knowledge Distillation with Refined Logits | Code | 1
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models | Code | 1
Discrimination-aware Channel Pruning for Deep Neural Networks | Code | 1
An Information Theory-inspired Strategy for Automatic Network Pruning | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
Discrimination-aware Network Pruning for Deep Model Compression | Code | 1
Deep Compression for PyTorch Model Deployment on Microcontrollers | Code | 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models | Code | 1
"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | Code | 1
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning | Code | 1
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
Memory-Efficient Backpropagation through Large Linear Layers | Code | 1
DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization | Code | 1
MicroNet for Efficient Language Modeling | Code | 1
3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching | Code | 1
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Code | 1
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | Code | 1
A Real-time Low-cost Artificial Intelligence System for Autonomous Spraying in Palm Plantations | Code | 1
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers | Code | 1
Basic Binary Convolution Unit for Binarized Image Restoration Network | Code | 1
DiSparse: Disentangled Sparsification for Multitask Model Compression | Code | 1
Dynamic Channel Pruning: Feature Boosting and Suppression | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
Compression-Aware Video Super-Resolution | Code | 1
Composable Interventions for Language Models | Code | 1
Compacting, Picking and Growing for Unforgetting Continual Learning | Code | 1
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
CompRess: Self-Supervised Learning by Compressing Representations | Code | 1
Contrastive Representation Distillation | Code | 1
Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data | Code | 1
Streamlining Redundant Layers to Compress Large Language Models | Code | 1
Communication-Computation Trade-Off in Resource-Constrained Edge Inference | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion | Code | 1
Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference | Code | 1
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Code | 1
A Unified Pruning Framework for Vision Transformers | Code | 1
CrossKD: Cross-Head Knowledge Distillation for Object Detection | Code | 1
CoA: Towards Real Image Dehazing via Compression-and-Adaptation | Code | 1
Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Differentiable Model Compression via Pseudo Quantization Noise | Code | 1
Discovering Dynamic Patterns from Spatiotemporal Data with Time-Varying Low-Rank Autoregression | Code | 1
COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | | Unverified