Model Compression
Model compression has been an actively pursued research area over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.
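Two of the techniques named above can be sketched in a few lines of NumPy. The snippet below is a minimal illustration, not a production method: `magnitude_prune` zeroes out the smallest-magnitude fraction of a weight tensor (unstructured pruning), and `quantize_uniform` maps weights onto a small set of evenly spaced levels (uniform weight quantization). Both function names and the example matrix are hypothetical, chosen only for demonstration.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

def quantize_uniform(weights, bits=2):
    """Map weights to 2**bits evenly spaced levels between min and max."""
    levels = 2 ** bits
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / (levels - 1)
    q = np.round((weights - lo) / scale)   # integer level index per weight
    return q * scale + lo                  # dequantize back to float range

W = np.array([[0.9, -0.05, 0.3],
              [-0.02, 0.7, -0.4]])
pruned = magnitude_prune(W, sparsity=0.5)   # half the entries set to zero
quantized = quantize_uniform(W, bits=2)     # at most 4 distinct values
```

Real pruning and quantization pipelines (e.g. iterative pruning with fine-tuning, or learned codebooks as in DKM) are considerably more involved, but they build on these same primitives.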
Papers
1,356 papers
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MobileBERT + 2bit-1dim model compression using DKM | Accuracy | 82.13 | — | Unverified |
| 2 | MobileBERT + 1bit-1dim model compression using DKM | Accuracy | 63.17 | — | Unverified |