SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
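To make the float-to-fixed-point mapping concrete, below is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. The function names and the max-abs scaling rule are illustrative assumptions, not the specific method of the paper quoted above:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: map a float32 tensor to int8.

    The largest-magnitude element is mapped to +/-127; every other
    value is rounded to the nearest multiple of the shared scale.
    """
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from int8 values."""
    return q.astype(np.float32) * scale

x = np.float32([0.5, -1.2, 3.4, 0.0])
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Rounding bounds the per-element error by scale / 2.
```

Storing `q` instead of `x` cuts memory 4x, and integer matrix multiplies on `q` are what give quantized training/inference its speedup; the single shared `scale` is all that is needed to return to float space.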

Papers

Showing 1301–1350 of 4925 papers

| Title | Status | Hype |
|-------|--------|------|
| Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction | Code | 0 |
| Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory | Code | 0 |
| Hybrid coarse-fine classification for head pose estimation | Code | 0 |
| HyperFlow: Representing 3D Objects as Surfaces | Code | 0 |
| Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis | Code | 0 |
| Improved Gradient based Adversarial Attacks for Quantized Networks | Code | 0 |
| Homology-constrained vector quantization entropy regularizer | Code | 0 |
| High-Accuracy Low-Precision Training | Code | 0 |
| A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization | Code | 0 |
| Highly Optimized Kernels and Fine-Grained Codebooks for LLM Inference on Arm CPUs | Code | 0 |
| HOT: Hadamard-based Optimized Training | Code | 0 |
| A Mixed Quantization Network for Computationally Efficient Mobile Inverse Tone Mapping | Code | 0 |
| Hierarchical Encoding of Sequential Data With Compact and Sub-Linear Storage Cost | Code | 0 |
| Hierarchical Quantized Representations for Script Generation | Code | 0 |
| BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer | Code | 0 |
| HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance | Code | 0 |
| BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights | Code | 0 |
| An efficient and straightforward online quantization method for a data stream through remove-birth updating | Code | 0 |
| Depthwise Discrete Representation Learning | Code | 0 |
| BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks | Code | 0 |
| Hessian Aware Quantization of Spiking Neural Networks | Code | 0 |
| Improved Knowledge Distillation for Crowd Counting on IoT Device | Code | 0 |
| LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning | Code | 0 |
| DNN Feature Map Compression using Learned Representation over GF(2) | Code | 0 |
| Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation | Code | 0 |
| Harnessing Large Language Models Locally: Empirical Results and Implications for AI PC | Code | 0 |
| Hardening DNNs against Transfer Attacks during Network Compression using Greedy Adversarial Pruning | Code | 0 |
| Hardware Acceleration for Real-Time Wildfire Detection Onboard Drone Networks | Code | 0 |
| Denoising Noisy Neural Networks: A Bayesian Approach with Compensation | Code | 0 |
| GT-SVQ: A Linear-Time Graph Transformer for Node Classification Using Spiking Vector Quantization | Code | 0 |
| Guetzli: Perceptually Guided JPEG Encoder | Code | 0 |
| GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples | Code | 0 |
| GraNNite: Enabling High-Performance Execution of Graph Neural Networks on Resource-Constrained Neural Processing Units | Code | 0 |
| A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off | Code | 0 |
| GQFedWAvg: Optimization-Based Quantized Federated Learning in General Edge Computing Systems | Code | 0 |
| HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization | Code | 0 |
| FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond | Code | 0 |
| GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs | Code | 0 |
| Genie: Show Me the Data for Quantization | Code | 0 |
| Deep Triplet Quantization | Code | 0 |
| Deep Task-Based Analog-to-Digital Conversion | Code | 0 |
| DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients | Code | 0 |
| Goten: GPU-Outsourcing Trusted Execution of Neural Network Training and Prediction | Code | 0 |
| Bag of Tricks for Optimizing Transformer Efficiency | Code | 0 |
| DeepShift: Towards Multiplication-Less Neural Networks | Code | 0 |
| General Point Model Pretraining with Autoencoding and Autoregressive | Code | 0 |
| Deep reverse tone mapping | Code | 0 |
| Generalized Relevance Learning Grassmann Quantization | Code | 0 |
| Deep residual network for steganalysis of digital images | Code | 0 |
| Deep Recurrent Quantization for Generating Sequential Binary Codes | Code | 0 |
Page 27 of 99

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 99.8 | | Unverified |