
Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
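To make the idea concrete, here is a minimal NumPy sketch of uniform affine (asymmetric) quantization, mapping a float32 tensor to int8 codes and back. This is a generic illustration, not the specific scheme of the cited paper; the helper names quantize_int8/dequantize_int8 are hypothetical.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map a float32 tensor to int8 codes with a scale and zero point.

    Generic asymmetric quantization sketch (illustrative, not the cited
    paper's exact method).
    """
    qmin, qmax = -128, 127
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:  # constant tensor; any positive scale works
        scale = 1.0
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover approximate float32 values from int8 codes."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(x)
x_hat = dequantize_int8(q, scale, zp)
print("max abs error:", np.abs(x - x_hat).max())  # bounded by ~scale/2
```

The round trip loses at most about half a quantization step per value, which is the trade-off these papers exploit: cheaper int8 arithmetic and storage in exchange for a small, controllable approximation error.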

Papers

Showing 4051–4100 of 4925 papers

Title | Status | Hype
Embedding Compression with Isotropic Iterative Quantization | | 0
Gaussian Approximation of Quantization Error for Estimation from Compressed Data | | 0
Resource-Efficient Neural Networks for Embedded Systems | | 0
RPR: Random Partition Relaxation for Training; Binary and Ternary Weight Neural Networks | | 0
Attention based on-device streaming speech recognition with large speech corpus | | 0
Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript | | 0
Acceleration for Compressed Gradient Descent in Distributed Optimization | | 0
Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers | | 0
Differentiable Product Quantization for Learning Compact Embedding Layers | | 0
Efficient Systolic Array Based on Decomposable MAC for Quantized Deep Neural Networks | | 0
New Loss Functions for Fast Maximum Inner Product Search | | 0
Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection | | 0
Towards Unified INT8 Training for Convolutional Neural Network | | 0
AdaBits: Neural Network Quantization with Adaptive Bit-Widths | Code | 0
EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets | Code | 0
FQ-Conv: Fully Quantized Convolution for Efficient and Accurate Inference | | 0
Interleaved Composite Quantization for High-Dimensional Similarity Search | | 0
Adaptive Loss-aware Quantization for Multi-bit Networks | Code | 0
Neural Networks Weights Quantization: Target None-retraining Ternary (TNT) | | 0
Efficient Error-Tolerant Quantized Neural Network Accelerators | | 0
Attention network forecasts time-to-failure in laboratory shear experiments | | 0
Learned Variable-Rate Image Compression with Residual Divisive Normalization | | 0
Maximum Average Entropy-Based Quantization of Local Observations for Distributed Detection | | 0
Compressing 3DCNNs Based on Tensor Train Decomposition | | 0
Tensor Recovery from Noisy and Multi-Level Quantized Measurements | | 0
RTN: Reparameterized Ternary Network | | 0
Deep Model Compression Via Two-Stage Deep Reinforcement Learning | | 0
EDAS: Efficient and Differentiable Architecture Search | | 0
Optimizing the energy consumption of spiking neural networks for neuromorphic applications | Code | 0
Coresets for Archetypal Analysis | Code | 0
Generalization Error Analysis of Quantized Compressive Learning | | 0
Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks | Code | 0
Post training 4-bit quantization of convolutional networks for rapid-deployment | Code | 0
Normalization Helps Training of Quantized LSTM | Code | 0
The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic | | 0
Random Projections with Asymmetric Quantization | | 0
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations | Code | 0
A binary-activation, multi-level weight RNN and training algorithm for ADC-/DAC-free and noise-resilient processing-in-memory inference with eNVM | | 0
Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization | | 0
QKD: Quantization-aware Knowledge Distillation | | 0
Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices | | 0
Music Source Separation in the Waveform Domain | | 0
Two-Stage Learning for Uplink Channel Estimation in One-Bit Massive MIMO | | 0
Model-Aware Deep Architectures for One-Bit Compressive Variational Autoencoding | Code | 0
Pyramid Vector Quantization and Bit Level Sparsity in Weights for Efficient Neural Networks Inference | | 0
A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation | | 0
Quantization Networks | Code | 0
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech | | 0
IFQ-Net: Integrated Fixed-point Quantization Networks for Embedded Vision | | 0
On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning | Code | 0
Page 82 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified