SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
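
As a rough illustration of the definition above, the following is a minimal sketch of symmetric, per-tensor int8 quantization in plain Python. It is not the method of the cited paper; the function names `quantize_int8` and `dequantize_int8`, the [-127, 127] code range, and the example weights are assumptions chosen for illustration.

```python
def quantize_int8(values):
    """Map floats to int8 codes plus one shared scale (symmetric, per-tensor)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: symmetric range [-127, 127]; an all-zero input gets
    # scale 1.0 so that we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp into the int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical example weights, standing in for a float32 tensor.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision given up in exchange for cheaper integer storage and arithmetic.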

Papers

Showing 1001-1050 of 4925 papers

Title | Status | Hype
Comprehensive Comparisons of Uniform Quantization in Deep Image Compression | Code | 0
Comprehensive Analysis of the Object Detection Pipeline on UAVs | Code | 0
Compositional Sketch Search | Code | 0
Composite Quantization | Code | 0
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs | Code | 0
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration | Code | 0
Make RepVGG Greater Again: A Quantization-aware Approach | Code | 0
Machine Learning at the Wireless Edge: Distributed Stochastic Gradient Descent Over-the-Air | Code | 0
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition | Code | 0
ACCEPT: Adaptive Codebook for Composite and Efficient Prompt Tuning | Code | 0
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers | Code | 0
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices | Code | 0
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks | Code | 0
A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks | Code | 0
LSQ++: Lower running time and higher recall in multi-codebook quantization | Code | 0
Low-Precision Stochastic Gradient Langevin Dynamics | Code | 0
Low Precision Decentralized Distributed Training over IID and non-IID Data | Code | 0
Picking Up Quantization Steps for Compressed Image Classification | Code | 0
Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation | Code | 0
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression | Code | 0
Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation | Code | 0
Low-bit Model Quantization for Deep Neural Networks: A Survey | Code | 0
Low-bit Quantization of Neural Networks for Efficient Inference | Code | 0
LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning | Code | 0
Low-complexity acoustic scene classification for multi-device audio: analysis of DCASE 2021 Challenge systems | Code | 0
Low dimensional representation of multi-patient flow cytometry datasets using optimal transport for minimal residual disease detection in leukemia | Code | 0
LVPNet: A Latent-variable-based Prediction-driven End-to-end Framework for Lossless Compression of Medical Images | Code | 0
Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation | Code | 0
Communication Efficient Private Federated Learning Using Dithering | Code | 0
Log-Time K-Means Clustering for 1D Data: Novel Approaches with Proof and Implementation | Code | 0
Loss Aware Post-training Quantization | Code | 0
Communication-Efficient Multi-Device Inference Acceleration for Transformer Models | Code | 0
Communication-Efficient Federated Learning via Predictive Coding | Code | 0
Loss-aware Weight Quantization of Deep Networks | Code | 0
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization | Code | 0
Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks | Code | 0
Additive Noise Annealing and Approximation Properties of Quantized Neural Networks | Code | 0
Communication-Efficient Federated Linear and Deep Generalized Canonical Correlation Analysis | Code | 0
LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities | Code | 0
A Quantization-Friendly Separable Convolution for MobileNets | Code | 0
Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback | Code | 0
LISA: Learning Interpretable Skill Abstractions from Language | Code | 0
Loss Landscape Analysis for Reliable Quantized ML Models for Scientific Sensing | Code | 0
Linearly Converging Error Compensated SGD | Code | 0
Accelerating PoT Quantization on Edge Devices | Code | 0
Communication-Censored Distributed Stochastic Gradient Descent | Code | 0
Lightweight Client-Side Chinese/Japanese Morphological Analyzer Based on Online Learning | Code | 0
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap | Code | 0
Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems | Code | 0
Light Multi-segment Activation for Model Compression | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified