SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
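As a minimal sketch of the idea (illustrative only, not the scheme from the cited paper), symmetric per-tensor int8 quantization maps a float tensor onto the integer range [-127, 127] using a single scale factor, and dequantization multiplies the scale back in:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization.

    The scale maps the largest-magnitude value in x to 127; every
    element is then rounded to the nearest representable integer.
    """
    scale = max(float(np.max(np.abs(x))) / 127.0, 1e-12)  # guard against all-zero input
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the int8 codes."""
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 1.27, -1.0], dtype=np.float32)
q, s = quantize_int8(x)       # q = [10, -50, 127, -100]
x_hat = dequantize(q, s)      # close to x, up to rounding error of ~scale/2
```

Per-tensor symmetric scaling is the simplest variant; real training-time schemes typically use per-channel scales and separate quantizers for weights, activations, and gradients.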

Papers

Showing 1651–1700 of 4925 papers

| Title | Status | Hype |
|-------|--------|------|
| Experimental results on palmvein-based personal recognition by multi-snapshot fusion of textural features | | 0 |
| Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment | | 0 |
| Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks | | 0 |
| Exploiting Change Blindness for Video Coding: Perspectives from a Less Promising User Study | | 0 |
| Exploiting Intelligent Reflecting Surfaces in NOMA Networks: Joint Beamforming Optimization | | 0 |
| Exploiting Latent Properties to Optimize Neural Codecs | | 0 |
| Optimizing Learned Image Compression on Scalar and Entropy-Constraint Quantization | | 0 |
| Exploiting Modern Hardware for High-Dimensional Nearest Neighbor Search | | 0 |
| Exploiting Non-uniform Quantization for Enhanced ILC in Wideband Digital Pre-distortion | | 0 |
| Exploiting Offset-guided Network for Pose Estimation and Tracking | | 0 |
| Cognitive Coding of Speech | | 0 |
| A Probabilistic Reformulation Technique for Discrete RIS Optimization in Wireless Systems | | 0 |
| Dynamic Stashing Quantization for Efficient Transformer Training | | 0 |
| Exploration of Activation Fault Reliability in Quantized Systolic Array-Based DNN Accelerators | | 0 |
| Explore Cross-Codec Quality-Rate Convex Hulls Relation for Adaptive Streaming | | 0 |
| Explore the Potential of CNN Low Bit Training | | 0 |
| Exploring Automatic Gym Workouts Recognition Locally On Wearable Resource-Constrained Devices | | 0 |
| Collaborative Edge AI Inference over Cloud-RAN | | 0 |
| Exploring Extreme Quantization in Spiking Language Models | | 0 |
| Exploring FPGA designs for MX and beyond | | 0 |
| Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization | | 0 |
| Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis | | 0 |
| Dynamic Signal Measurements Based on Quantized Data | | 0 |
| Collaborative Quantization Embeddings for Intra-Subject Prostate MR Image Registration | | 0 |
| Dynamic quantized consensus under DoS attacks: Towards a tight zooming-out factor | | 0 |
| Dynamic Quantized Consensus of General Linear Multi-agent Systems under Denial-of-Service Attacks | | 0 |
| Exploring Semantic Segmentation on the DCT Representation | | 0 |
| Blind-Adaptive Quantizers | | 0 |
| An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits | | 0 |
| Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising | | 0 |
| Post Training Quantization of Large Language Models with Microscaling Formats | | 0 |
| Extreme Compression for Pre-trained Transformers Made Simple and Efficient | | 0 |
| Dynamic Q&A of Clinical Documents with Large Language Models | | 0 |
| Extreme Image Compression using Fine-tuned VQGANs | | 0 |
| Dynamic Probabilistic Pruning: A general framework for hardware-constrained pruning at different granularities | | 0 |
| Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM | | 0 |
| Blending Low and High-Level Semantics of Time Series for Better Masked Time Series Generation | | 0 |
| Dynamic Predictive Sampling Analog to Digital Converter for Sparse Signal Sensing | | 0 |
| Face recognition using color local binary pattern from mutually independent color channels | | 0 |
| Factorized Visual Tokenization and Generation | | 0 |
| Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks | | 0 |
| FactorizeNet: Progressive Depth Factorization for Efficient Network Architecture Exploration Under Quantization Constraints | | 0 |
| Communication-Efficient Decentralized Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control | | 0 |
| False Detection (Positives and Negatives) in Object Detection | | 0 |
| ADaPTION: Toolbox and Benchmark for Training Convolutional Neural Networks with Reduced Numerical Precision Weights and Activation | | 0 |
| FantastIC4: A Hardware-Software Co-Design Approach for Efficiently Running 4bit-Compact Multilayer Perceptrons | | 0 |
| FAQ: Mitigating the Impact of Faults in the Weight Memory of DNN Accelerators through Fault-Aware Quantization | | 0 |
| FAQS: Communication-efficient Federate DNN Architecture and Quantization Co-Search for personalized Hardware-aware Preferences | | 0 |
| Fixed Point Quantization of Deep Convolutional Networks | | 0 |
| FlashAttention on a Napkin: A Diagrammatic Approach to Deep Learning IO-Awareness | | 0 |
Page 34 of 99

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | | Accuracy | 99.8 | | Unverified |