SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
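
To make the float-to-integer replacement concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a generic illustration, not the adaptive-precision method of the paper cited above; the function names `quantize_int8` and `dequantize` and the sample values are assumptions made for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes in [-127, 127] that share one scale factor."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole list; fall back to 1.0 for empty or all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer (Python rounds ties to even), then clamp.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Worst-case reconstruction error; bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The small value 0.003 collapses to code 0, which shows the trade-off: each stored value shrinks from 32 bits to 8, at the price of a rounding error of at most `scale / 2` per element.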

Papers

Showing 2251-2300 of 4925 papers

Title | Status | Hype
MBQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization | Code | 1
Analyzing Compression Techniques for Computer Vision | - | 0
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples | Code | 0
Quantization in Spiking Neural Networks | Code | 0
Accelerator-Aware Training for Transducer-Based Speech Recognition | - | 0
Patch-wise Mixed-Precision Quantization of Vision Transformer | - | 0
Speaker Diaphragm Excursion Prediction: deep attention and online adaptation | - | 0
PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer | Code | 1
Mobile Image Restoration via Prior Quantization | - | 0
Post-training Model Quantization Using GANs for Synthetic Data Generation | Code | 0
Distribution-Flexible Subset Quantization for Post-Quantizing Super-Resolution Networks | Code | 1
Multiscale Augmented Normalizing Flows for Image Compression | - | 0
Spiking Neural Networks in the Alexiewicz Topology: A New Perspective on Analysis and Error Bounds | Code | 0
CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation | - | 0
Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation | - | 0
A multimodal dynamical variational autoencoder for audiovisual speech representation learning | Code | 0
Vertical Federated Learning over Cloud-RAN: Convergence Analysis and System Optimization | - | 0
Emulation Learning for Neuromimetic Systems | - | 0
AQ-GT: a Temporally Aligned and Quantized GRU-Transformer for Co-Speech Gesture Synthesis | Code | 1
Hybrid model for Single-Stage Multi-Person Pose Estimation | - | 0
ICQ: A Quantization Scheme for Best-Arm Identification Over Bit-Constrained Channels | - | 0
Guaranteed Quantization Error Computation for Neural Network Model Compression | - | 0
Killing Two Birds with One Stone: Quantization Achieves Privacy in Distributed Learning | - | 0
Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks | - | 0
The Bjøntegaard Bible -- Why your Way of Comparing Video Codecs May Be Wrong | Code | 1
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection | Code | 1
Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks | - | 0
Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations | - | 0
Picking Up Quantization Steps for Compressed Image Classification | Code | 0
Transformer-based models and hardware acceleration analysis in autonomous driving: A survey | - | 0
Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric | - | 0
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling | Code | 1
DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables | - | 0
ATHEENA: A Toolflow for Hardware Early-Exit Network Automation | - | 0
ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection | Code | 1
Soft Label Coding for End-to-end Sound Source Localization With Ad-hoc Microphone Arrays | - | 0
Convergence rate of Tsallis entropic regularized optimal transport | - | 0
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs | - | 0
CABM: Content-Aware Bit Mapping for Single Image Super-Resolution Network with Large Input | Code | 1
D-SVM over Networked Systems with Non-Ideal Linking Conditions | - | 0
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression | Code | 0
Binary Latent Diffusion | Code | 1
Unsupervised Multi-Criteria Adversarial Detection in Deep Image Retrieval | - | 0
SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers | Code | 1
EMP-SSL: Towards Self-Supervised Learning in One Training Epoch | Code | 2
Unsupervised Speech Representation Pooling Using Vector Quantization | Code | 0
Benchmarking the Robustness of Quantized Models | - | 0
Similarity search in the blink of an eye with compressed indices | Code | 2
FedDiSC: A Computation-efficient Federated Learning Framework for Power Systems Disturbance and Cyber Attack Discrimination | - | 0
AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified