SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
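
As a rough illustration of the float-to-fixed-point replacement described above, the sketch below maps a list of floats to int8 codes using one shared scale, then maps them back. It is a generic symmetric uniform quantizer, not the method of the cited paper; the function names quantize_int8 and dequantize and the choice of a single per-tensor scale are assumptions made for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes with one shared scale (symmetric, per-tensor)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: the largest magnitude maps to 127; an all-zero input uses scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case round-trip error; for values that are not clipped it is
# at most half a quantization step (scale / 2).
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The integer codes can be stored and multiplied with cheaper integer arithmetic, while the single float scale is kept alongside them to interpret the result. Finer-grained schemes (for example, one scale per channel) trade extra bookkeeping for lower error.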

Papers

Showing 3151–3200 of 4925 papers

Title | Status | Hype
TwinDNN: A Tale of Two Deep Neural Networks |  | 0
Twin Network Augmentation: A Novel Training Strategy for Improved Spiking Neural Networks and Efficient Weight Quantization |  | 0
Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes |  | 0
Two Dimensional Array Imaging with Beam Steered Data |  | 0
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation |  | 0
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models |  | 0
Two-layer Near-lossless HDR Coding with Backward Compatibility to JPEG |  | 0
Two-Stage Hashing for Fast Document Retrieval |  | 0
Two-stage iterative Procrustes match algorithm and its application for VQ-based speaker verification |  | 0
Two-Stage Learning for Uplink Channel Estimation in One-Bit Massive MIMO |  | 0
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model |  | 0
UDC: Unified DNAS for Compressible TinyML Models |  | 0
ULMRec: User-centric Large Language Model for Sequential Recommendation |  | 0
Ultra-Lightweight Speech Separation via Group Communication |  | 0
Ultra-low Latency Adaptive Local Binary Spiking Neural Network with Accuracy Loss Estimator |  | 0
Ultra-low latency quantum-inspired machine learning predictors implemented on FPGA |  | 0
Ultra-low Power Deep Learning-based Monocular Relative Localization Onboard Nano-quadrotors |  | 0
Ultra-Low Precision 4-bit Training of Deep Neural Networks |  | 0
Ultra-low Precision Multiplication-free Training for Deep Neural Networks |  | 0
Unbiased and Sign Compression in Distributed Learning: Comparing Noise Resilience via SDEs |  | 0
Uncertainty-Aware Deep Video Compression with Ensembles |  | 0
Uncertainty Estimation in Multi-Agent Distributed Learning |  | 0
Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor |  | 0
Unconstrained Face Recognition using ASURF and Cloud-Forest Classifier optimized with VLAD |  | 0
Understanding Flatness in Generative Models: Its Role and Benefits |  | 0
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases |  | 0
Understanding the Difficulty of Low-Precision Post-Training Quantization for LLMs |  | 0
Understanding the Impact of Post-Training Quantization on Large Language Models |  | 0
Understanding the Impact of Precision Quantization on the Accuracy and Energy of Neural Networks |  | 0
Understanding Unconventional Preprocessors in Deep Convolutional Neural Networks for Face Identification |  | 0
UniCode: Learning a Unified Codebook for Multimodal Large Language Models |  | 0
UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation |  | 0
Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization |  | 0
Unified Anomaly Detection methods on Edge Device using Knowledge Distillation and Quantization |  | 0
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning |  | 0
Unified learning-based lossy and lossless JPEG recompression |  | 0
Unified Stochastic Framework for Neural Network Quantization and Pruning |  | 0
Uniform-Precision Neural Network Quantization via Neural Channel Expansion |  | 0
Unifying KV Cache Compression for Large Language Models with LeanKV |  | 0
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion |  | 0
UniHM: Universal Human Motion Generation with Object Interactions in Indoor Scenes |  | 0
UNIQ: Uniform Noise Injection for Non-Uniform Quantization of Neural Networks |  | 0
Universal Deep Neural Network Compression |  | 0
Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size |  | 0
Universal Joint Source-Channel Coding for Modulation-Agnostic Semantic Communication |  | 0
Universally Quantized Neural Compression |  | 0
Unleashing Dynamic Range and Resolution in Unlimited Sensing Framework via Novel Hardware |  | 0
Unlimited Sampling Radar: a Real-Time End-to-End Demonstrator |  | 0
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales |  | 0
Enhancing Multimodal Unified Representations for Cross Modal Generalization |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 |  | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 |  | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 |  | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 |  | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 |  | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 |  | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 |  | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 |  | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 |  | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 |  | Unverified
2 | DTQ | MAP | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 |  | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 98.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 92.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 95.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 96.38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 99.8 |  | Unverified