SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
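
As a rough illustration of the float-to-fixed-point replacement described above, the sketch below maps a list of floats to int8 codes and back. It assumes plain symmetric uniform quantization with one scale per tensor, which is a generic textbook scheme and not the adaptive-precision method of the cited paper. The names quantize_int8 and dequantize are hypothetical helpers introduced only for this example.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to int8 codes using a single symmetric scale (illustrative only)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: one scale per tensor, chosen so the largest magnitude maps to 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.4, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from its original by at most half a quantization step (scale / 2). That rounding error is the accuracy cost traded for cheaper integer arithmetic.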

Papers

Showing 2351–2400 of 4925 papers

Title | Status | Hype
A comprehensive review of Binary Neural Network | | 0
Interest Point Detection based on Adaptive Ternary Coding | | 0
Joint Neural Architecture Search and Quantization | | 0
Joint Optimization of Rate, Distortion, and Decoding Energy for HEVC Intraframe Coding | | 0
Modified Vector Quantization for Small-Cell Access Point Placement with Inter-Cell Interference | | 0
Bang for the Buck: Vector Search on Cloud CPUs | | 0
Interactions Across Blocks in Post-Training Quantization of Large Language Models | | 0
Joint Quantization and Pruning Neural Networks Approach: A Case Study on FSO Receivers | | 0
Reconfigurable Intelligent Surface-induced Randomness for mmWave Key Generation | | 0
Joint SPX-VIX calibration with Gaussian polynomial volatility models: deep pricing with quantization hints | | 0
A White Paper on Neural Network Quantization | | 0
Joint Texture and Geometry Optimization for RGB-D Reconstruction | | 0
Joshua 4.0: Packing, PRO, and Paraphrases | | 0
Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime Optimization | | 0
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations | | 0
JPEG Quantized Coefficient Recovery via DCT Domain Spatial-Frequential Transformer | | 0
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models | | 0
Learning low-precision neural networks without Straight-Through Estimator(STE) | | 0
Intelligent Fault Diagnosis of Type and Severity in Low-Frequency, Low Bit-Depth Signals | | 0
Integrating PHY Security Into NDN-IoT Networks By Exploiting MEC: Authentication Efficiency, Robustness, and Accuracy Enhancement | | 0
Deep neural networks are robust to weight binarization and other non-linear distortions | | 0
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization | | 0
Kernel k-Medoids as General Vector Quantization | | 0
Kernel Quantization for Efficient Network Compression | | 0
AWEQ: Post-Training Quantization with Activation-Weight Equalization for Large Language Models | | 0
Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis | | 0
Killing Two Birds with One Stone: Quantization Achieves Privacy in Distributed Learning | | 0
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression | | 0
K-Means Hashing: An Affinity-Preserving Quantization Method for Learning Binary Compact Codes | | 0
Knowledge Distillation: A Survey | | 0
Knowledge Distillation in Vision Transformers: A Critical Review | | 0
Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning | | 0
Learning Linear Block Codes with Gradient Quantization | | 0
Designing a Classifier for Active Fire Detection from Multispectral Satellite Imagery Using Neural Architecture Search | | 0
Kramers-Kronig Receiver Combined With Digital Resolution Enhancer | | 0
KurTail : Kurtosis-based LLM Quantization | | 0
Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs | | 0
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization | | 0
KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache | | 0
Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications | | 0
Deep Neural Network Models Compression | | 0
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks | | 0
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models | | 0
Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models | | 0
A Wave is Worth 100 Words: Investigating Cross-Domain Transferability in Time Series | | 0
LAMBDA: Covering the Solution Set of Black-Box Inequality by Search Space Quantization | | 0
LANCE: Efficient Low-Precision Quantized Winograd Convolution for Neural Networks Based on Graphics Processing Units | | 0
Design of Stochastic Quantizers for Privacy Preservation | | 0
A Low Memory Footprint Quantized Neural Network for Depth Completion of Very Sparse Time-of-Flight Depth Maps | | 0
Learning Sparse Low-Precision Neural Networks With Learnable Regularization | | 0
Page 48 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified