SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
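
For intuition, the sketch below shows the core float-to-fixed-point mapping in NumPy: symmetric uniform quantization of a float32 tensor to int8, plus the corresponding dequantization. It is a minimal illustration of the idea described above, not the method of the cited paper; the function names and the toy tensor are invented for this example.

import numpy as np

def quantize_int8(x: np.ndarray):
    # Symmetric uniform quantization: choose a scale so the largest
    # magnitude in x maps to the int8 extreme 127 (the range is kept
    # symmetric, so the code -128 is deliberately unused).
    scale = float(np.max(np.abs(x))) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero input; avoid division by zero
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Map the int8 codes back to approximate float32 values.
    return q.astype(np.float32) * scale

# Quantize a small random weight tensor and measure the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", float(np.max(np.abs(w - w_hat))))

The int8 codes occupy a quarter of the memory of the float32 values, which is where the savings in the papers below come from; training-time schemes must additionally decide which tensors (weights, activations, gradients) can tolerate the rounding error.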

Papers

Showing 3301–3350 of 4925 papers

Title | Status | Hype
What Does a One-Bit Quanta Image Sensor Offer? |  | 0
What Happens When Small Is Made Smaller? Exploring the Impact of Compression on Small Data Pretrained Language Models |  | 0
An Evaluation of Memory Optimization Methods for Training Neural Networks |  | 0
What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation |  | 0
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization |  | 0
When Bio-Inspired Computing meets Deep Learning: Low-Latency, Accurate, & Energy-Efficient Spiking Neural Networks from Artificial Neural Networks |  | 0
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models |  | 0
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks |  | 0
Where Should We Begin? A Low-Level Exploration of Weight Initialization Impact on Quantized Behaviour of Deep Neural Networks |  | 0
Which Space Partitioning Tree to Use for Search? |  | 0
DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition |  | 0
Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks |  | 0
Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs |  | 0
Widening and Squeezing: Towards Accurate and Efficient QNNs |  | 0
Winning Amazon KDD Cup'24 |  | 0
Wireless End-to-End Image Transmission System using Semantic Communications |  | 0
Wireless Quantized Federated Learning: A Joint Computation and Communication Design |  | 0
Within-basket Recommendation via Neural Pattern Associator |  | 0
Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence |  | 0
Witten-type topological field theory of self-organized criticality for stochastic neural networks |  | 0
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More |  | 0
Word-based Domain Adaptation for Neural Machine Translation |  | 0
Work in Progress: Linear Transformers for TinyML |  | 0
WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic |  | 0
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic |  | 0
WRPN: Training and Inference using Wide Reduced-Precision Networks |  | 0
WSMN: An optimized multipurpose blind watermarking in Shearlet domain using MLP and NSGA-II |  | 0
WSNet: Compact and Efficient Networks Through Weight Sampling |  | 0
WSNet: Learning Compact and Efficient Networks with Weight Sampling |  | 0
Wyner-Ziv Gradient Compression for Federated Learning |  | 0
XCAT -- Lightweight Quantized Single Image Super-Resolution using Heterogeneous Group Convolutions and Cross Concatenation |  | 0
XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks |  | 0
XNOR-Net++: Improved Binary Neural Networks |  | 0
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers |  | 0
You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models |  | 0
YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving |  | 0
Consistent Signal Reconstruction from Streaming Multivariate Time Series |  | 0
Zero-Delay Gaussian Joint Source-Channel Coding for the Interference Channel |  | 0
FDC: Fast KV Dimensionality Compression for Efficient LLM Inference |  | 0
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training |  | 0
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats |  | 0
ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers |  | 0
Zero-shot Adversarial Quantization |  | 0
Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization |  | 0
Zero-shot Quantization: A Comprehensive Survey |  | 0
Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models |  | 0
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity |  | 0
ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning |  | 0
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification |  | 0
ZOBNN: Zero-Overhead Dependable Design of Binary Neural Networks with Deliberately Quantized Parameters |  | 0
Page 67 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 |  | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 |  | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 |  | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 |  | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 |  | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 |  | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 |  | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 |  | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 |  | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 |  | Unverified
2 | DTQ | MAP | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 |  | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 98.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 92.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 95.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 96.38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 99.8 |  | Unverified