SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
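To make the idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. This is an illustration of the general technique, not the method from the paper cited above; the function names `quantize_int8` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: map float32 values to int8
    using a single scale derived from the largest absolute value."""
    max_abs = np.max(np.abs(x))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a small weight tensor; the error is bounded by scale / 2.
weights = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The storage cost drops from 32 bits to 8 bits per value, at the price of a bounded rounding error (at most half the scale per element).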

Papers

Showing 1476–1500 of 4925 papers

| Title | Status | Hype |
| --- | --- | --- |
| Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators | Code | 0 |
| Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging | Code | 1 |
| BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models | Code | 1 |
| Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws | | 0 |
| Investigating the Impact of Quantization on Adversarial Robustness | | 0 |
| David and Goliath: An Empirical Evaluation of Attacks and Defenses for QNNs at the Deep Edge | Code | 0 |
| Nanometer Scanning with Micrometer Sensing: Beating Quantization Constraints in Lissajous Trajectory Tracking | | 0 |
| Gull: A Generative Multifunctional Audio Codec | | 0 |
| Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval | Code | 0 |
| What Happens When Small Is Made Smaller? Exploring the Impact of Compression on Small Data Pretrained Language Models | | 0 |
| Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes | | 0 |
| Outlier-Efficient Hopfield Layers for Large Transformer-Based Models | Code | 1 |
| Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization | Code | 0 |
| TinyVQA: Compact Multimodal Deep Neural Network for Visual Question Answering on Resource-Constrained Devices | | 0 |
| AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution | Code | 2 |
| DI-Retinex: Digital-Imaging Retinex Theory for Low-Light Image Enhancement | | 0 |
| CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | | 0 |
| Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models | | 0 |
| Efficient Multi-Vector Dense Retrieval Using Bit Vectors | Code | 2 |
| DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization | | 0 |
| PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Code | 3 |
| NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation | | 0 |
| On the Effect of Quantization on Dynamic Mode Decomposition | | 0 |
| RefQSR: Reference-based Quantization for Image Super-Resolution Networks | | 0 |
| Minimize Quantization Output Error with Bias Compensation | Code | 0 |
Page 60 of 197

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 99.8 | | Unverified |