SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
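
To make the float-to-fixed-point replacement concrete, here is a minimal sketch of symmetric linear quantization in plain Python. It is a generic illustration, not the method of the paper cited above; the function names `quantize` and `dequantize`, the `num_bits` parameter, and the sample weights are all assumptions chosen for this example.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale (symmetric scheme)."""
    # Largest representable magnitude: 127 for int8, 32767 for int16.
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed range.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def dequantize(ints, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in ints]


# Hypothetical weights, used only to exercise the functions.
weights = [0.5, -1.0, 0.25, 0.0]
q8, s8 = quantize(weights, num_bits=8)
restored = dequantize(q8, s8)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

In this scheme the rounding error per value is at most half a quantization step (`scale / 2`), so raising `num_bits` from 8 to 16 shrinks the step and the error at the cost of wider integers.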

Papers

Showing 2651-2700 of 4925 papers

Title | Status | Hype
CSQ: Centered Symmetric Quantization for Extremely Low Bit Neural Networks | - | 0
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification | - | 0
CSR:Achieving 1 Bit Key-Value Cache via Sparse Representation | - | 0
CTMQ: Cyclic Training of Convolutional Neural Networks with Multiple Quantization Steps | - | 0
CURSOR-BASED ADAPTIVE QUANTIZATION FOR DEEP NEURAL NETWORK | - | 0
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape | - | 0
Custom Gradient Estimators are Straight-Through Estimators in Disguise | - | 0
D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving | - | 0
DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning | - | 0
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | - | 0
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech | - | 0
DASNet: Dynamic Activation Sparsity for Neural Network Efficiency Improvement | - | 0
Data Augmentations in Deep Weight Spaces | - | 0
Data Clustering using a Hybrid of Fuzzy C-Means and Quantum-behaved Particle Swarm Optimization | - | 0
Data-Driven Deep Learning Based Hybrid Beamforming for Aerial Massive MIMO-OFDM Systems with Implicit CSI | - | 0
Data-Driven Depth Map Refinement via Multi-Scale Sparse Representation | - | 0
Data-driven Dynamic Event-triggered Control | - | 0
Data-Driven Sparsity-Based Restoration of JPEG-Compressed Images in Dual Transform-Pixel Domain | - | 0
Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks | - | 0
Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales | - | 0
Data-free mixed-precision quantization using novel sensitivity metric | - | 0
Data-Free Network Compression via Parametric Non-Uniform Mixed Precision Quantization | - | 0
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning | - | 0
Data-Free Quantization via Pseudo-label Filtering | - | 0
Data-Free Quantization with Accurate Activation Clipping and Adaptive Batch Normalization | - | 0
Data-free Weight Compress and Denoise for Large Language Models | - | 0
Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning | - | 0
Dataset Distillation as Pushforward Optimal Quantization | - | 0
DB-LLM: Accurate Dual-Binarization for Efficient LLMs | - | 0
DBQ: A Differentiable Branch Quantizer for Lightweight Deep Neural Networks | - | 0
DCNGAN: A Deformable Convolutional-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video | - | 0
DC-PCN: Point Cloud Completion Network with Dual-Codebook Guided Quantization | - | 0
Discrete Cosine Transform Based Decorrelated Attention for Vision Transformers | - | 0
Decentralized Low-Rank Fine-Tuning of Large Language Models | - | 0
Decentralized Optimization on Compact Submanifolds by Quantized Riemannian Gradient Tracking | - | 0
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression | - | 0
Decomposing Normal and Abnormal Features of Medical Images into Discrete Latent Codes for Content-Based Image Retrieval | - | 0
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes | - | 0
Decoupled Greedy Learning of CNNs for Synchronous and Asynchronous Distributed Learning | - | 0
DEED: A General Quantization Scheme for Communication Efficiency in Bits | - | 0
Deep activity propagation via weight initialization in spiking neural networks | - | 0
Deep and Shallow Covariance Feature Quantization for 3D Facial Expression Recognition | - | 0
Deep Asymmetric Hashing with Dual Semantic Regression and Class Structure Quantization | - | 0
Deep Attentive Generative Adversarial Network for Photo-Realistic Image De-Quantization | - | 0
Deep Autoencoder-based Z-Interference Channels with Perfect and Imperfect CSI | - | 0
Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes | - | 0
Deep Conditional Measure Quantization | - | 0
Deep Convolutional Compression for Massive MIMO CSI Feedback | - | 0
Deep data compression for approximate ultrasonic image formation | - | 0
DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified