SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
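
To make the float-to-fixed-point replacement concrete, here is a minimal sketch of symmetric per-tensor int8 quantization. It is an illustration under stated assumptions, not the method of the paper cited above: the function names `quantize_int8` and `dequantize`, the single shared scale, and the [-127, 127] code range are all choices made for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes with one shared scale (assumed symmetric scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Largest magnitude maps to 127; fall back to scale 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # round() uses round-half-to-even; clamp to guard against float edge cases.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 1.0]  # hypothetical example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Rounding error per element is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, scale, max_error)
```

Small values such as 0.003 collapse to code 0 here, which is the usual accuracy cost of a coarse grid; per-channel scales or wider types (int16) are common ways to reduce that loss.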

Papers

Showing 3401-3450 of 4925 papers

| Title | Status | Hype |
|---|---|---|
| A 14uJ/Decision Keyword Spotting Accelerator with In-SRAM-Computing and On Chip Learning for Customization | | 0 |
| A^2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization | | 0 |
| A 2-bit Wideband 5G mm-Wave RIS with Low Side Lobe Levels and no Quantization Lobe | | 0 |
| A3 : an Analytical Low-Rank Approximation Framework for Attention | | 0 |
| A 58.6mW Real-Time Programmable Object Detector with Multi-Scale Multi-Object Support Using Deformable Parts Model on 1920x1080 Video at 30fps | | 0 |
| DeltaKWS: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM | | 0 |
| A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface | | 0 |
| A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network | | 0 |
| A Bag of Tricks for Scaling CPU-based Deep FFMs to more than 300m Predictions per Second | | 0 |
| A binary-activation, multi-level weight RNN and training algorithm for ADC-/DAC-free and noise-resilient processing-in-memory inference with eNVM | | 0 |
| Ab-initio quantum chemistry with neural-network wavefunctions | | 0 |
| A Biresolution Spectral Framework for Product Quantization | | 0 |
| A blob method for inhomogeneous diffusion with applications to multi-agent control and sampling | | 0 |
| A Blockchain Solution for Collaborative Machine Learning over IoT | | 0 |
| Single-path Bit Sharing for Automatic Loss-aware Model Compression | | 0 |
| Abstractive summarization from Audio Transcription | | 0 |
| A Carbon Tracking Model for Federated Learning: Impact of Quantization and Sparsification | | 0 |
| Accelerated AI Inference via Dynamic Execution Methods | | 0 |
| Accelerated Distance Computation with Encoding Tree for High Dimensional Data | | 0 |
| Accelerating Deep Learning Inference via Freezing | | 0 |
| Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime | | 0 |
| Accelerating Deep Learning with Dynamic Data Pruning | | 0 |
| Accelerating Energy-Efficient Federated Learning in Cell-Free Networks with Adaptive Quantization | | 0 |
| Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization | | 0 |
| Accelerating Neural Network Inference by Overflow Aware Quantization | | 0 |
| Accelerating RNN-based Speech Enhancement on a Multi-Core MCU with Mixed FP16-INT8 Post-Training Quantization | | 0 |
| Acceleration for Compressed Gradient Descent in Distributed Optimization | | 0 |
| Acceleration of Convolutional Neural Network Using FFT-Based Split Convolutions | | 0 |
| Accelerator-Aware Training for Transducer-Based Speech Recognition | | 0 |
| AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design | | 0 |
| Accumulator-Aware Post-Training Quantization | | 0 |
| Accuracy is Not All You Need | | 0 |
| Accuracy to Throughput Trade-offs for Reduced Precision Neural Networks on Reconfigurable Logic | | 0 |
| Accurate Block Quantization in LLMs with Outliers | | 0 |
| Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization | | 0 |
| Accurate Deep Representation Quantization with Gradient Snapping Layer for Similarity Search | | 0 |
| Accurate INT8 Training Through Dynamic Block-Level Fallback | | 0 |
| Accurate Sine-Wave Amplitude Measurements Using Nonlinearly Quantized Data | | 0 |
| A Channelized Binning Method for Extraction of Dominant Color Pixel Value | | 0 |
| Achieving binary weight and activation for LLMs using Post-Training Quantization | | 0 |
| Achieving Robustness in Blind Modulo Analog-to-Digital Conversion | | 0 |
| Differentially Quantized Gradient Methods | | 0 |
| Lean classical-quantum hybrid neural network model for image classification | | 0 |
| A Closed-loop Sleep Modulation System with FPGA-Accelerated Deep Learning | | 0 |
| A CNN-based Prediction-Aware Quality Enhancement Framework for VVC | | 0 |
| A Genetic Algorithm Approach for Image Representation Learning through Color Quantization | | 0 |
| A Compact and Discriminative Face Track Descriptor | | 0 |
| A comparative study of several parameterizations for speaker recognition | | 0 |
| A comparative study of several ADPCM schemes with linear and nonlinear prediction | | 0 |
| A comparison study of CNN denoisers on PRNU extraction | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |
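
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 99.8 | | Unverified |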