SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
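
As a rough illustration of the definition above, the sketch below maps a few float values to int8 codes and back. It assumes a simple symmetric, per-tensor scheme with a single scale factor, chosen only for illustration; it is not the method of the cited paper, and the names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumed convention: an all-zero (or empty) input uses scale 1.0 to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code, then clamp to the int8 range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.031, 1.0]
codes, scale = quantize_int8(weights)   # codes == [50, -127, 3, 100], scale is about 0.01
restored = dequantize(codes, scale)     # each value is within scale / 2 of the original
```

The round trip loses precision: 0.031 comes back as roughly 0.03. That is the usual trade of accuracy for cheaper integer storage and arithmetic.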

Papers

Showing 2001-2050 of 4925 papers

| Title | Status | Hype |
| --- | --- | --- |
| Gradient ℓ1 Regularization for Quantization Robustness | | 0 |
| Gradient-Free Neural Network Training on the Edge | | 0 |
| Does Video Compression Impact Tracking Accuracy? | | 0 |
| GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training | | 0 |
| BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs | | 0 |
| Granger Causality from Quantized Measurements | | 0 |
| Countering Adversarial Examples: Combining Input Transformation and Noisy Training | | 0 |
| GranQ: Granular Zero-Shot Quantization with Channel-Wise Activation Scaling in QAT | | 0 |
| Graph-Based Depth Denoising & Dequantization for Point Cloud Enhancement | | 0 |
| Graph-Collaborated Auto-Encoder Hashing for Multi-view Binary Clustering | | 0 |
| An Empirical Study of Low Precision Quantization for TinyML | | 0 |
| Does compressing activations help model parallel training? | | 0 |
| Greedy Selection for Heterogeneous Sensors | | 0 |
| Greener yet Powerful: Taming Large Code Generation Models with Quantization | | 0 |
| An Embedded Iris Recognition System Optimization using Dynamically ReconfigurableDecoder with LDPC Codes | | 0 |
| Do All MobileNets Quantize Poorly? Gaining Insights into the Effect of Quantization on Depthwise Separable Convolutional Networks Through the Eyes of Multi-scale Distributional Dynamics | | 0 |
| DNQ: Dynamic Network Quantization | | 0 |
| DNN Quantization with Attention | | 0 |
| Gridless Multisnapshot Variational Line Spectral Estimation from Coarsely Quantized Samples | | 0 |
| Group channel pruning and spatial attention distilling for object detection | | 0 |
| Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free | | 0 |
| Group Invariant Deep Representations for Image Instance Retrieval | | 0 |
| SwiftPrune: Hessian-Free Weight Pruning for Large Language Models | | 0 |
| Hybrid and Non-Uniform DNN quantization methods using Retro Synthesis data for efficient inference | | 0 |
| Group Sparse Coding | | 0 |
| CQ-VAE: Coordinate Quantized VAE for Uncertainty Estimation with Application to Disk Shape Analysis from Lumbar Spine MRI Images | | 0 |
| DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization | | 0 |
| GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field | | 0 |
| Guaranteed Quantization Error Computation for Neural Network Model Compression | | 0 |
| CRB Analysis for Mixed-ADC Based DOA Estimation | | 0 |
| Biologically Plausible Learning on Neuromorphic Hardware Architectures | | 0 |
| Bioinspired Cortex-based Fast Codebook Generation | | 0 |
| Gull: A Generative Multifunctional Audio Codec | | 0 |
| GWQ: Gradient-Aware Weight Quantization for Large Language Models | | 0 |
| Haar Wavelet Feature Compression for Quantized Graph Convolutional Networks | | 0 |
| DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference | | 0 |
| HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference | | 0 |
| An Efficient Network with Novel Quantization Designed for Massive MIMO CSI Feedback | | 0 |
| Hadamard Domain Training with Integers for Class Incremental Quantized Learning | | 0 |
| HadaNets: Flexible Quantization Strategies for Neural Networks | | 0 |
| HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations | | 0 |
| HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis | | 0 |
| Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks | | 0 |
| Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks | | 0 |
| HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration | | 0 |
| LANA: Latency Aware Network Acceleration | | 0 |
| BinaryViT: Towards Efficient and Accurate Binary Vision Transformers | | 0 |
| An Efficient Index for Visual Search in Appearance-based SLAM | | 0 |
| Diversifying Sample Generation for Accurate Data-Free Quantization | | 0 |
| QVGen: Pushing the Limit of Quantized Video Generative Models | | 0 |

Page 41 of 99

Benchmark Results

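| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 99.8 | | Unverified |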