SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
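The definition above can be sketched in a few lines. The `quantize_int8` helper below is hypothetical (not from the cited paper, whose adaptive-precision scheme is more involved); it shows the common baseline of symmetric per-tensor quantization, where one shared scale factor maps floats to int8-range codes.

```python
# Minimal sketch, assuming symmetric per-tensor quantization (a common
# baseline, not the cited paper's adaptive scheme).

def quantize_int8(xs):
    """Map a list of floats to int8-range codes; returns (codes, scale)."""
    scale = max(abs(v) for v in xs) / 127.0   # largest magnitude maps to +/-127
    codes = [max(-127, min(127, round(v / scale))) for v in xs]
    return codes, scale

def dequantize(codes, scale):
    """Approximately reconstruct the original floats from the codes."""
    return [c * scale for c in codes]

x = [0.5, -1.2, 3.3, -0.01]
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2).
err = max(abs(a - b) for a, b in zip(x, x_hat))
print(q, err)
```

Storage drops 4x (one int8 per float32), at the price of a bounded rounding error per element; real training-time schemes additionally quantize gradients and activations.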

Papers

Showing 1251–1275 of 4925 papers

Title | Status | Hype
xCOMET-lite: Bridging the Gap Between Efficiency and Quality in Learned MT Evaluation Metrics | Code | 0
High-Fidelity Facial Albedo Estimation via Texture Quantization | — | 0
SDQ: Sparse Decomposed Quantization for LLM Inference | — | 0
Q-SNNs: Quantized Spiking Neural Networks | — | 0
Attention-aware Post-training Quantization without Backpropagation | — | 0
Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates | — | 0
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models | Code | 1
MSE Minimization in RIS-Aided MU-MIMO with Discrete Phase Shifts and Fronthaul Quantization | — | 0
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization | — | 0
QTIP: Quantization with Trellises and Incoherence Processing | Code | 1
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking | Code | 1
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Code | 2
Autoregressive Image Generation without Vector Quantization | Code | 5
Deep-Learning-Based Channel Estimation for Distributed MIMO with 1-bit Radio-Over-Fiber Fronthaul | — | 0
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization | — | 0
Optimization of Armv9 architecture general large language model inference performance based on Llama.cpp | Code | 0
An Analysis on Quantizing Diffusion Transformers | — | 0
Promoting Data and Model Privacy in Federated Learning through Quantized LoRA | — | 0
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox | Code | 1
Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training | — | 0
How Should We Extract Discrete Audio Tokens from Self-Supervised Models? | — | 0
Optimizing Byte-level Representation for End-to-end ASR | — | 0
Precipitation Nowcasting Using Physics Informed Discriminator Generative Models | — | 0
One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model | — | 0
GEB-1.3B: Open Lightweight Large Language Model | — | 0
Page 51 of 197

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | — | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | — | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | — | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | — | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | — | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | — | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | — | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | — | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | — | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | — | Unverified
2 | DTQ | MAP | 0.79 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | — | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 98.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 92.92 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 95.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 96.38 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 99.8 | — | Unverified