SOTAVerified

Model Compression

Model compression has been an active area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
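To make the three techniques named above concrete, here is a minimal NumPy sketch of each, applied to a single weight matrix. This is illustrative only and is not drawn from any of the listed papers: the function names, the 50% sparsity target, the 4-bit width, and the rank-2 truncation are all arbitrary choices for the example.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_uniform(weights, bits=8):
    """Uniform symmetric quantization to the given bit width, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale  # values now lie on a 2**bits-level grid

def low_rank_factorize(weights, rank):
    """Truncated SVD: approximate W (m x n) by two rank-r factors U_r and V_r."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

w_pruned = magnitude_prune(w, sparsity=0.5)
w_quant = quantize_uniform(w, bits=4)
u_r, v_r = low_rank_factorize(w, rank=2)
print(np.mean(w_pruned == 0))  # fraction of zeroed weights
```

Each transform trades accuracy for size in a different way: pruning yields sparse tensors, quantization shrinks the per-weight bit width, and factorization replaces one `m x n` matrix with two factors holding `r * (m + n)` values.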

Papers

Showing 251–275 of 1356 papers

Title | Status | Hype
Aerial Image Classification in Scarce and Unconstrained Environments via Conformal Prediction | — | 0
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs | — | 0
D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | — | 0
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs | Code | 0
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning | — | 0
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design | Code | 0
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models | — | 0
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression | Code | 0
Compression Laws for Large Language Models | — | 0
RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation | — | 0
Compositionality Unlocks Deep Interpretable Models | — | 0
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression | — | 0
Multi-Task Semantic Communications via Large Models | — | 0
Penrose Tiled Low-Rank Compression and Section-Wise Q&A Fine-Tuning: A General Framework for Domain-Specific Large Language Model Adaptation | — | 0
Delving Deep into Semantic Relation Distillation | — | 0
Boosting Large Language Models with Mask Fine-Tuning | Code | 0
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness | — | 0
A Low-Power Streaming Speech Enhancement Accelerator For Edge Devices | — | 0
Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration | — | 0
Temporal Action Detection Model Compression by Progressive Block Drop | — | 0
Large Language Model Compression via the Nested Activation-Aware Decomposition | — | 0
InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer | — | 0
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting | — | 0
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning | — | 0
Fragile Mastery: Are Domain-Specific Trade-Offs Undermining On-Device Language Models? | — | 0
Page 11 of 55

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified