SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
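
As a rough illustration of replacing floating-point values with int8 codes, the sketch below applies symmetric per-tensor quantization with a single scale factor. This is a generic textbook scheme chosen for brevity, not the method of the paper cited above; the function names `quantize_int8` and `dequantize_int8` and the sample values are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes (illustrative)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale for the whole list; fall back to 1.0 for all-zero or empty input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step (Python rounds ties to even)
    # and clamp to the int8 range [-128, 127].
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical sample values, chosen only for illustration.
weights = [0.5, -1.27, 0.03, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from its original by at most half a quantization step (`scale / 2`). That rounding error is the accuracy cost traded for cheaper integer arithmetic.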

Papers

Showing 1651-1675 of 4925 papers

Title | Status | Hype
Majority Kernels: An Approach to Leverage Big Model Dynamics for Efficient Small Model Training | - | 0
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap | Code | 0
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks | Code | 4
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes | - | 0
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Code | 2
A Survey on Transformer Compression | - | 0
Quantized Approximately Orthogonal Recurrent Neural Networks | - | 0
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache | Code | 3
Optimal and Near-Optimal Adaptive Vector Quantization | - | 0
FoldToken: Learning Protein Language via Vector Quantization and Beyond | - | 0
Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network | - | 0
Leveraging Continuously Differentiable Activation Functions for Learning in Quantized Noisy Environments | Code | 0
LQER: Low-Rank Quantization Error Reconstruction for LLMs | Code | 1
Locally-Adaptive Quantization for Streaming Vector Search | - | 0
Ultrafast jet classification on FPGAs for the HL-LHC | Code | 0
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning | - | 0
Large Language Models for Time Series: A Survey | Code | 4
FedShift: Tackling Dual Heterogeneity Problem of Federated Learning via Weight Shift Aggregation | - | 0
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding | Code | 0
Truncated Non-Uniform Quantization for Distributed SGD | - | 0
HW-SW Optimization of DNNs for Privacy-preserving People Counting on Low-resolution Infrared Arrays | - | 0
Neural Language of Thought Models | - | 0
Faster Inference of Integer SWIN Transformer by Removing the GELU Activation | - | 0
An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified