SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
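To make the float-to-fixed-point mapping concrete, here is a minimal sketch of symmetric per-tensor int8 quantization with NumPy. This is an illustrative example of the general technique, not the adaptive scheme from the cited paper; the function names `quantize_int8` and `dequantize` are this sketch's own.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map float32 values to int8 codes.

    One scale factor covers the whole tensor; the largest magnitude
    maps to +/-127.
    """
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a random tensor; the reconstruction error per element
# is bounded by half a quantization step (scale / 2).
x = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
print(np.max(np.abs(x - x_hat)))
```

The int8 codes need 4x less memory than float32 and admit cheaper integer arithmetic, which is the cost saving the definition above refers to; the price is the bounded rounding error shown in the round-trip.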

Papers

Showing 1276–1300 of 4925 papers

| Title | Status | Hype |
| --- | --- | --- |
| CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer | | 0 |
| Adaptive Quantization of Neural Networks | | 0 |
| CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | | 0 |
| Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime | | 0 |
| A Planck Radiation and Quantization Scheme for Human Cognition and Language | | 0 |
| Choose Your Model Size: Any Compression by a Single Gradient Descent | | 0 |
| Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning | | 0 |
| Efficient and Workload-Aware LLM Serving via Runtime Layer Swapping and KV Cache Resizing | | 0 |
| Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost | | 0 |
| CHIME: A Compressive Framework for Holistic Interest Modeling | | 0 |
| Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models | | 0 |
| A Picture is Worth a Billion Bits: Real-Time Image Reconstruction from Dense Binary Pixels | | 0 |
| Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge | | 0 |
| Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models | | 0 |
| Adaptive Quantization for Key Generation in Low-Power Wide-Area Networks | | 0 |
| Accelerating Deep Learning Inference via Freezing | | 0 |
| Characterizing the Accuracy -- Efficiency Trade-off of Low-rank Decomposition in Language Models | | 0 |
| Characterizing Coherent Integrated Photonic Neural Networks under Imperfections | | 0 |
| APG-MOS: Auditory Perception Guided-MOS Predictor for Synthetic Speech | | 0 |
| Characterization of the frequency response of channel-interleaved photonic ADCs based on the optical time-division demultiplexer | | 0 |
| Adaptive Quantization for Deep Neural Network | | 0 |
| An approach to optimize inference of the DIART speaker diarization pipeline | | 0 |
| Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference | | 0 |
| A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications | | 0 |
| Characterising Bias in Compressed Models | | 0 |
Page 52 of 197

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 99.8 | | Unverified |