SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
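The float-to-fixed-point mapping described above can be sketched with uniform affine quantization. This is a minimal, generic illustration of mapping float32 values to int8 via a scale and zero-point, not the specific scheme from the cited paper; the function names are ours.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Uniform affine quantization of a float32 array to int8.

    Maps the observed value range [min(x), max(x)] onto the int8
    range [-128, 127] via a scale and zero-point.
    """
    qmin, qmax = -128, 127
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid zero scale for constant input
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover an approximate float32 array from int8 values."""
    return (q.astype(np.float32) - zero_point) * scale

# Round-trip: the dequantized values approximate the originals,
# with error bounded by roughly one quantization step (the scale).
x = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(x)
x_hat = dequantize_int8(q, scale, zp)
```

Storing `q` (int8) plus the two calibration constants takes roughly a quarter of the memory of the original float32 tensor, which is the cost reduction the quoted definition refers to.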

Papers

Showing 326–350 of 4925 papers

Title | Status | Hype
ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals | Code | 1
Relation-Guided Adversarial Learning for Data-free Knowledge Transfer | Code | 1
MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models | Code | 1
Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Code | 1
BiDM: Pushing the Limit of Quantization for Diffusion Models | Code | 1
Temporally Compressed 3D Gaussian Splatting for Dynamic Scenes | Code | 1
Improving Detail in Pluralistic Image Inpainting with Feature Dequantization | Code | 1
DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation | Code | 1
Quantization without Tears | Code | 1
MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization | Code | 1
Privacy-Preserving Graph-Based Machine Learning with Fully Homomorphic Encryption for Collaborative Anti-Money Laundering | Code | 1
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization | Code | 1
Abstracted Shapes as Tokens -- A Generalizable and Interpretable Model for Time-series Classification | Code | 1
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments | Code | 1
IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models | Code | 1
Vector Quantization Prompting for Continual Learning | Code | 1
Catastrophic Failure of LLM Unlearning via Quantization | Code | 1
Residual vector quantization for KV cache compression in large language model | Code | 1
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Code | 1
Learning Graph Quantized Tokenizers | Code | 1
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs | Code | 1
Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks | Code | 1
SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression | Code | 1
QT-DoG: Quantization-aware Training for Domain Generalization | Code | 1
Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization | Code | 1
Page 14 of 197

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | — | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | — | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | — | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | — | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | — | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | — | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | — | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | — | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | — | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | — | Unverified
2 | DTQ | MAP | 0.79 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | — | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 98.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 92.92 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 95.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 96.38 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 99.8 | — | Unverified