SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
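
As a rough illustration of the float-to-fixed-point replacement described above, the sketch below maps a list of floats to int8 codes and back. It assumes symmetric per-tensor quantization with a single scale factor, which is only one common scheme and is not taken from the cited paper; the function names `quantize_int8` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the int8 range.

    Assumption: one scale for the whole list, chosen so that the largest
    magnitude maps to 127. Returns the integer codes and that scale.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 when every input is zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to [-127, 127].
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


weights = [0.42, -1.27, 0.0, 0.9, -0.05]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The round-trip error per value is bounded by about half of the scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The integer codes can be stored and multiplied with cheaper integer arithmetic, while the scale is kept alongside them to recover approximate real values. Training-time schemes such as the one in the cited paper involve further choices (for example, how precision is selected for gradients) that this sketch does not attempt to model.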

Papers

Showing 4701-4750 of 4925 papers

Title | Status | Hype
Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights | Code | 0
Activation Compression of Graph Neural Networks using Block-wise Quantization with Improved Variance Minimization | Code | 0
Decoupling Meta-Reinforcement Learning with Gaussian Task Contexts and Skills | Code | 0
A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging | Code | 0
Soft Weight-Sharing for Neural Network Compression | Code | 0
Regularized Classification-Aware Quantization | Code | 0
A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization | Code | 0
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs | Code | 0
Improving Self-Supervised Learning-based MOS Prediction Networks | Code | 0
David and Goliath: An Empirical Evaluation of Attacks and Defenses for QNNs at the Deep Edge | Code | 0
Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes | Code | 0
Efficient Cross-Modal Retrieval via Deep Binary Hashing and Quantization | Code | 0
A Mixed Quantization Network for Computationally Efficient Mobile Inverse Tone Mapping | Code | 0
Playing Atari with Six Neurons | Code | 0
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration | Code | 0
Improving Robustness Against Stealthy Weight Bit-Flip Attacks by Output Code Matching | Code | 0
PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement | Code | 0
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting | Code | 0
Relaxed Quantization for Discretized Neural Networks | Code | 0
Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers | Code | 0
Improved Knowledge Distillation for Crowd Counting on IoT Device | Code | 0
Improved Gradient based Adversarial Attacks for Quantized Networks | Code | 0
Central Similarity Quantization for Efficient Image and Video Retrieval | Code | 0
Implicit Feature Decoupling with Depthwise Quantization | Code | 0
Data Upcycling Knowledge Distillation for Image Super-Resolution | Code | 0
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs | Code | 0
Web-Scale Image Clustering Revisited | Code | 0
Efficient course recommendations with T5-based ranking and summarization | Code | 0
Image Hashing by Minimizing Discrete Component-wise Wasserstein Distance | Code | 0
Post-training 4-bit quantization of convolution networks for rapid-deployment | Code | 0
Post training 4-bit quantization of convolutional networks for rapid-deployment | Code | 0
A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks | Code | 0
Post-training Model Quantization Using GANs for Synthetic Data Generation | Code | 0
VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization | Code | 0
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines | Code | 0
Causal-DFQ: Causality Guided Data-free Network Quantization | Code | 0
REMIND Your Neural Network to Prevent Catastrophic Forgetting | Code | 0
Remote Inference over Dynamic Links via Adaptive Rate Deep Task-Oriented Vector Quantization | Code | 0
Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis | Code | 0
Efficient computation of counterfactual explanations of LVQ models | Code | 0
Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting | Code | 0
RepBNN: towards a precise Binary Neural Network with Enhanced Feature Map via Repeating | Code | 0
Efficient CNN-LSTM based Image Captioning using Neural Network Compression | Code | 0
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs | Code | 0
A Quantization-Friendly Separable Convolution for MobileNets | Code | 0
IBVC: Interpolation-driven B-frame Video Compression | Code | 0
Focused Quantization for Sparse CNNs | Code | 0
CUCL: Codebook for Unsupervised Continual Learning | Code | 0
Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing | Code | 0
Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning | Code | 0
Page 95 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 |  | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 |  | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 |  | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 |  | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 |  | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 |  | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 |  | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 |  | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 |  | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 |  | Unverified
2 | DTQ | MAP | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 |  | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 98.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 92.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 95.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 96.38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 99.8 |  | Unverified