
Model Compression

Model compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
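To make the three techniques named above concrete, here is a minimal NumPy sketch of each on a toy weight matrix. The helper names, shapes, and hyperparameters are illustrative assumptions, not from the cited paper or any particular library: magnitude pruning zeroes out the smallest weights, uniform quantization rounds weights to a coarse grid, and truncated SVD yields a low-rank factorization.

```python
# Illustrative sketch only; function names are hypothetical, not a library API.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Parameter pruning: zero out the smallest-magnitude fraction of weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

def uniform_quantize(weights: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Weight quantization: round weights to a uniform grid of 2**num_bits levels."""
    levels = 2 ** num_bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels
    codes = np.round((weights - w_min) / scale)  # integer codes in [0, levels]
    return codes * scale + w_min                 # dequantized approximation

def low_rank_factorize(weights: np.ndarray, rank: int):
    """Low-rank factorization: approximate W ~ A @ B via truncated SVD."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (m, rank)
    b = vt[:rank, :]            # shape (rank, n)
    return a, b

# Toy demonstration on a random 256x256 weight matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256))

pruned = magnitude_prune(w, sparsity=0.9)    # keep only the top 10% of weights
quantized = uniform_quantize(w, num_bits=2)  # only 4 representable values
a, b = low_rank_factorize(w, rank=16)        # 256*256 -> 2*256*16 parameters

print(f"pruned nonzeros:  {np.count_nonzero(pruned) / w.size:.1%}")
print(f"quantization MSE: {np.mean((w - quantized) ** 2):.4f}")
print(f"low-rank MSE:     {np.mean((w - a @ b) ** 2):.4f}")
```

In practice these operations are applied per layer of a trained network and are typically followed by fine-tuning to recover the accuracy lost to compression.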

Papers

Showing 231–240 of 1356 papers

Title | Status | Hype
Language Model Knowledge Distillation for Efficient Question Answering in Spanish | Code | 0
Learning Intrinsic Sparse Structures within Long Short-Term Memory | Code | 0
Chemical transformer compression for accelerating both training and inference of molecular modeling | Code | 0
Annealing Knowledge Distillation | Code | 0
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs | Code | 0
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0
A Programmable Approach to Neural Network Compression | Code | 0
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition | Code | 0
Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy | Code | 0
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | – | Unverified
2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | – | Unverified