SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
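
To make the float-to-fixed-point replacement described above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a generic illustration, not the adaptive-precision method of the cited paper; the function names `quantize_int8` and `dequantize`, the sample `weights`, and the choice of a single shared scale are all assumptions made for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one shared scale (symmetric, per-tensor)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: the largest magnitude maps to 127; an all-zero input uses scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical sample values, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from its original by at most half a quantization step (`scale / 2`), which is the precision traded away for cheaper integer storage and arithmetic.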

Papers

Showing 4501-4550 of 4925 papers

Title | Status | Hype
Quantizing deep convolutional networks for efficient inference: A whitepaper | Code | 0
A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance | Code | 0
Deep residual network for steganalysis of digital images | Code | 0
LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | Code | 0
Ultrafast jet classification on FPGAs for the HL-LHC | Code | 0
Large Scale Clustering with Variational EM for Gaussian Mixture Models | Code | 0
QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise | Code | 0
Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks | Code | 0
AdaBits: Neural Network Quantization with Adaptive Bit-Widths | Code | 0
Bit Error Robustness for Energy-Efficient DNN Accelerators | Code | 0
Towards Lossless ANN-SNN Conversion under Ultra-Low Latency with Dual-Phase Optimization | Code | 0
Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback | Code | 0
Deep Recurrent Quantization for Generating Sequential Binary Codes | Code | 0
Exact Backpropagation in Binary Weighted Networks with Group Weight Transformations | Code | 0
Automated Cancer Subtyping via Vector Quantization Mutual Information Maximization | Code | 0
Deep Priority Hashing | Code | 0
On Quantizing Neural Representation for Variable-Rate Video Coding | Code | 0
Evaluating Single Event Upsets in Deep Neural Networks for Semantic Segmentation: an embedded system perspective | Code | 0
Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks | Code | 0
On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks | Code | 0
Communication-Censored Distributed Stochastic Gradient Descent | Code | 0
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge | Code | 0
Evaluating Large Language Models on the Frame and Symbol Grounding Problems: A Zero-shot Benchmark | Code | 0
Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning | Code | 0
SHE: A Fast and Accurate Deep Neural Network for Encrypted Data | Code | 0
On the Discrepancy between the Theoretical Analysis and Practical Implementations of Compressed Communication for Distributed Deep Learning | Code | 0
Estimation and Restoration of Unknown Nonlinear Distortion using Diffusion | Code | 0
On the Downstream Performance of Compressed Word Embeddings | Code | 0
ES-ENAS: Efficient Evolutionary Optimization for Large Hybrid Search Spaces | Code | 0
Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD | Code | 0
Deep Neural Network for Respiratory Sound Classification in Wearable Devices Enabled by Patient Specific Model Tuning | Code | 0
Error Diffusion Halftoning Against Adversarial Examples | Code | 0
ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks | Code | 0
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance | Code | 0
Unbounded cache model for online language modeling with open vocabulary | Code | 0
Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks | Code | 0
Equal Bits: Enforcing Equally Distributed Binary Network Weights | Code | 0
Shifting Capsule Networks from the Cloud to the Deep Edge | Code | 0
4bit-Quantization in Vector-Embedding for RAG | Code | 0
QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks | Code | 0
Deep Neural Network Compression with Single and Multiple Level Quantization | Code | 0
QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks | Code | 0
On the Perturbed States for Transformed Input-robust Reinforcement Learning | Code | 0
enpheeph: A Fault Injection Framework for Spiking and Compressed Deep Neural Networks | Code | 0
Enhancing Low-Precision Sampling via Stochastic Gradient Hamiltonian Monte Carlo | Code | 0
End-to-end Learning of Deep Visual Representations for Image Retrieval | Code | 0
End-to-End Human Pose Reconstruction from Wearable Sensors for 6G Extended Reality Systems | Code | 0
DIVISION: Memory Efficient Training via Dual Activation Precision | Code | 0
EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration | Code | 0
QWID: Quantized Weed Identification Deep neural network | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 |  | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 |  | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 |  | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 |  | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 |  | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 |  | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 |  | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 |  | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 |  | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 |  | Unverified
2 | DTQ | MAP | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 |  | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 98.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 92.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 95.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 96.38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 99.8 |  | Unverified