SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
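The float-to-fixed-point mapping described above can be sketched as a minimal affine (scale + zero-point) int8 quantizer. This is a generic illustration of the idea, not the scheme from the cited paper; the function names and the asymmetric [-128, 127] mapping are illustrative assumptions.

```python
import numpy as np

def quantize_int8(x):
    """Affine quantization of a float32 array onto the int8 range [-128, 127].

    Illustrative sketch: a real value r is approximated as
    (q - zero_point) * scale, where q is an int8 code.
    """
    x = np.asarray(x, dtype=np.float32)
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / 255.0 or 1.0  # guard against constant inputs
    zero_point = round(-x_min / scale) - 128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Map int8 codes back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale

# Round-trip a random tensor and check the reconstruction error,
# which is bounded by the quantization step (scale).
x = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale, zp = quantize_int8(x)
x_hat = dequantize_int8(q, scale, zp)
print(q.dtype, float(np.abs(x - x_hat).max()))
```

Each int8 code costs 1 byte instead of float32's 4, and integer matrix multiplies are substantially cheaper on most hardware, which is the cost saving the definition above refers to.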

Papers

Showing 2651–2700 of 4925 papers

| Title | Status | Hype |
|---|---|---|
| Patch-wise Mixed-Precision Quantization of Vision Transformer | | 0 |
| Post-training Model Quantization Using GANs for Synthetic Data Generation | Code | 0 |
| Mobile Image Restoration via Prior Quantization | | 0 |
| Multiscale Augmented Normalizing Flows for Image Compression | | 0 |
| Spiking Neural Networks in the Alexiewicz Topology: A New Perspective on Analysis and Error Bounds | Code | 0 |
| CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation | | 0 |
| Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation | | 0 |
| A multimodal dynamical variational autoencoder for audiovisual speech representation learning | Code | 0 |
| Emulation Learning for Neuromimetic Systems | | 0 |
| Vertical Federated Learning over Cloud-RAN: Convergence Analysis and System Optimization | | 0 |
| Hybrid model for Single-Stage Multi-Person Pose Estimation | | 0 |
| ICQ: A Quantization Scheme for Best-Arm Identification Over Bit-Constrained Channels | | 0 |
| Killing Two Birds with One Stone: Quantization Achieves Privacy in Distributed Learning | | 0 |
| Guaranteed Quantization Error Computation for Neural Network Model Compression | | 0 |
| Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks | | 0 |
| Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks | | 0 |
| Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations | | 0 |
| Transformer-based models and hardware acceleration analysis in autonomous driving: A survey | | 0 |
| Picking Up Quantization Steps for Compressed Image Classification | Code | 0 |
| Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric | | 0 |
| DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables | | 0 |
| ATHEENA: A Toolflow for Hardware Early-Exit Network Automation | | 0 |
| Soft Label Coding for End-to-end Sound Source Localization With Ad-hoc Microphone Arrays | | 0 |
| Convergence rate of Tsallis entropic regularized optimal transport | | 0 |
| D-SVM over Networked Systems with Non-Ideal Linking Conditions | | 0 |
| Learning Accurate Performance Predictors for Ultrafast Automated Model Compression | Code | 0 |
| End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs | | 0 |
| Unsupervised Multi-Criteria Adversarial Detection in Deep Image Retrieval | | 0 |
| Benchmarking the Robustness of Quantized Models | | 0 |
| Unsupervised Speech Representation Pooling Using Vector Quantization | Code | 0 |
| AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks | | 0 |
| FedDiSC: A Computation-efficient Federated Learning Framework for Power Systems Disturbance and Cyber Attack Discrimination | | 0 |
| Blockwise Compression of Transformer-based Models without Retraining | | 0 |
| A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation | | 0 |
| Distributed Optimization for Quadratic Cost Functions over Large-Scale Networks with Quantized Communication and Finite-Time Convergence | | 0 |
| FP8 versus INT8 for efficient deep learning inference | | 0 |
| A Joint Model and Data Driven Method for Distributed Estimation | | 0 |
| oBERTa: Improving Sparse Transfer Learning via improved initialization, distillation, and pruning regimes | | 0 |
| SC-VAE: Sparse Coding-based Variational Autoencoder with Learned ISTA | Code | 0 |
| Tetra-AML: Automatic Machine Learning via Tensor Networks | | 0 |
| Low-Dose CT Image Reconstruction using Vector Quantized Convolutional Autoencoder with Perceptual Loss | | 0 |
| Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis | | 0 |
| An Evaluation of Memory Optimization Methods for Training Neural Networks | | 0 |
| LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression | | 0 |
| Towards Accurate Post-Training Quantization for Vision Transformer | | 0 |
| Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance | | 0 |
| The Quantization Model of Neural Scaling | Code | 0 |
| Scaled Quantization for the Vision Transformer | | 0 |
| Posthoc Interpretation via Quantization | | 0 |
| Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems | Code | 0 |
Page 54 of 99

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 99.8 | | Unverified |