SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
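To make the definition concrete, here is a minimal sketch of symmetric int8 quantization in NumPy. This illustrates the generic scale-based scheme the definition describes, not the specific adaptive-precision method of the cited paper; the function names are illustrative.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric quantization: map float32 values to int8 via a single scale."""
    max_abs = np.max(np.abs(x))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from int8 codes."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.2, 3.4, 0.0], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)
# Per-element reconstruction error is bounded by scale / 2.
```

The trade-off is precision for cost: int8 storage is 4x smaller than float32, and integer arithmetic is cheaper on most hardware, at the price of a rounding error proportional to the scale.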

Papers

Showing 951–975 of 4925 papers

| Title | Status | Hype |
|---|---|---|
| Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation | | 0 |
| Towards AI-Native Fronthaul: Neural Compression for NextG Cloud RAN | | 0 |
| EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model | Code | 0 |
| Bridging the Modality Gap: Softly Discretizing Audio Representation for LLM-based Automatic Speech Recognition | | 0 |
| BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning | | 0 |
| Massive MIMO with 1-Bit DACs: Data Detection for Quantized Linear Precoding with Dithering | | 0 |
| FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion | | 0 |
| Kernel k-Medoids as General Vector Quantization | | 0 |
| TaDA: Training-free recipe for Decoding with Adaptive KV Cache Compression and Mean-centering | | 0 |
| FPTQuant: Function-Preserving Transforms for LLM Quantization | | 0 |
| PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling | | 0 |
| STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Code | 0 |
| BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing | | 0 |
| Nonlinear Sparse Bayesian Learning Methods with Application to Massive MIMO Channel Estimation with Hardware Impairments | | 0 |
| Quantized Dissipative Uncertain Model for Fractional T-S Fuzzy Systems with Time-Varying Delays Under Networked Control System | | 0 |
| MUC-G4: Minimal Unsat Core-Guided Incremental Verification for Deep Neural Network Compression | | 0 |
| Enhancing Convergence, Privacy and Fairness for Wireless Personalized Federated Learning: Quantization-Assisted Min-Max Fair Scheduling | | 0 |
| Flexible Mixed Precision Quantization for Learned Image Compression | Code | 0 |
| Enhancing Speech Emotion Recognition with Graph-Based Multimodal Fusion and Prosodic Features for the Speech Emotion Recognition in Naturalistic Conditions Challenge at Interspeech 2025 | | 0 |
| Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs | | 0 |
| Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws | Code | 0 |
| Structured Pruning and Quantization for Learned Image Compression | Code | 0 |
| CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer | | 0 |
| Quantization-based Bounds on the Wasserstein Metric | | 0 |
| Power-of-Two (PoT) Weights in Large Language Models (LLMs) | | 0 |
Page 39 of 197

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | Accuracy | 99.8 | | Unverified |