HPTQ: Hardware-Friendly Post Training Quantization
Hai Victor Habi, Reuven Peretz, Elad Cohen, Lior Dikstein, Oranit Dror, Idit Diamant, Roy H. Jennings, Arnon Netzer
- Official code (PyTorch): github.com/sony/model_optimization — ★ 432
Abstract
Neural network quantization enables the deployment of models on edge devices. An essential requirement for their hardware efficiency is that the quantizers are hardware-friendly: uniform, symmetric, and with power-of-two thresholds. To the best of our knowledge, current post-training quantization methods do not support all of these constraints simultaneously. In this work, we introduce a hardware-friendly post training quantization (HPTQ) framework, which addresses this problem by synergistically combining several known quantization methods. We perform a large-scale study on four tasks: classification, object detection, semantic segmentation and pose estimation over a wide variety of network architectures. Our extensive experiments show that competitive results can be obtained under hardware-friendly constraints.
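The hardware-friendly constraints named in the abstract — a uniform, symmetric quantizer whose threshold is a power of two — can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration of the general technique, not the paper's actual implementation (the official code linked above contains that); the function names and the max-abs threshold heuristic are hypothetical choices for clarity.

```python
import numpy as np

def power_of_two_threshold(x):
    """Smallest power-of-two threshold covering the tensor's max magnitude.

    Hypothetical helper: HPTQ searches thresholds more carefully; this
    sketch just rounds the max-abs value up to the next power of two.
    """
    max_abs = np.max(np.abs(x))
    return 2.0 ** np.ceil(np.log2(max_abs))

def symmetric_uniform_quantize(x, n_bits=8):
    """Uniform symmetric quantization with a power-of-two threshold.

    With threshold t and n_bits bits, the step size is t / 2^(n_bits - 1),
    and quantized integers lie in [-2^(n_bits-1), 2^(n_bits-1) - 1].
    """
    t = power_of_two_threshold(x)
    scale = t / (2 ** (n_bits - 1))
    q_min, q_max = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    q = np.clip(np.round(x / scale), q_min, q_max)
    return q * scale  # dequantized values, uniformly spaced and symmetric

# Example: a weight tensor with max magnitude 1.2 gets threshold t = 2.0,
# so the 8-bit step size is 2.0 / 128 = 0.015625.
w = np.array([0.3, -1.2, 0.9, 0.0])
w_q = symmetric_uniform_quantize(w, n_bits=8)
```

Because the threshold is a power of two, the rescaling by `scale` reduces to a bit shift in fixed-point hardware, which is the efficiency argument behind these constraints.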
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| COCO (Common Objects in Context) | SSD ResNet50 V1 FPN 640x640 | mAP | 34.3 | — | Unverified |
| ImageNet | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | — | Unverified |
| ImageNet | EfficientNet-B0 ReLU W8A8 | Top-1 Accuracy (%) | 77.09 | — | Unverified |
| ImageNet | EfficientNet-B0 W8A8 | Top-1 Accuracy (%) | 74.22 | — | Unverified |
| ImageNet | DenseNet-121 W8A8 | Top-1 Accuracy (%) | 73.36 | — | Unverified |
| ImageNet | MobileNetV2 W8A8 | Top-1 Accuracy (%) | 71.46 | — | Unverified |