SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
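
To make the float32-to-int8 replacement described above concrete, here is a minimal sketch of symmetric per-tensor quantization with NumPy. The scaling scheme and the helper names `quantize_int8` and `dequantize` are assumptions chosen for illustration; this is not the adaptive-precision method of the cited paper.

```python
import numpy as np

def quantize_int8(x):
    """Map float32 values to int8 using one shared scale (illustrative helper)."""
    max_abs = float(np.max(np.abs(x)))
    # Assumption: symmetric range [-127, 127]; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

x = np.array([-1.0, -0.4, 0.0, 0.3, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)

# Rounding to the nearest code bounds the error by about half a quantization step.
max_error = float(np.max(np.abs(x - x_hat)))
print(q, scale, max_error)
```

The int8 codes take a quarter of the storage of the float32 originals, at the cost of a rounding error of at most roughly `scale / 2` per value.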

Papers

Showing 1001-1025 of 4925 papers

Title | Status | Hype
Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant | Code | 0
Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search | Code | 2
LASERS: LAtent Space Encoding for Representations with Sparsity for Generative Modeling | - | 0
Forearm Ultrasound based Gesture Recognition on Edge | - | 0
Language Models and Retrieval Augmented Generation for Automated Structured Data Extraction from Diagnostic Reports | - | 0
MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation | - | 0
Improving Statistical Significance in Human Evaluation of Automatic Metrics via Soft Pairwise Accuracy | - | 0
Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare | - | 0
Robust Training of Neural Networks at Arbitrary Precision and Sparsity | - | 0
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training | Code | 2
Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling | - | 0
Dequantization of a signal from two parallel quantized observations | - | 0
Efficient and Reliable Vector Similarity Search Using Asymmetric Encoding with NAND-Flash for Many-Class Few-Shot Learning | - | 0
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing | Code | 1
Distributed Convolutional Neural Network Training on Mobile and Edge Clusters | - | 0
Adaptive Error-Bounded Hierarchical Matrices for Efficient Neural Network Compression | - | 0
STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM | - | 0
NVRC: Neural Video Representation Compression | - | 0
AgileIR: Memory-Efficient Group Shifted Windows Attention for Agile Image Restoration | - | 0
Rate-Constrained Quantization for Communication-Efficient Federated Learning | - | 0
Distributed Optimization with Finite Bit Adaptive Quantization for Efficient Communication and Precision Enhancement | - | 0
ECG Biometric Authentication Using Self-Supervised Learning for IoT Edge Sensors | - | 0
BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec | Code | 3
Estimating the Completeness of Discrete Speech Units | - | 0
SGC-VQGAN: Towards Complex Scene Representation via Semantic Guided Clustering Codebook | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified