SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
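To make the definition concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the scheme the description above alludes to. The function names (`quantize_int8`, `dequantize_int8`) are illustrative, not from any paper listed below; real training-time quantizers add per-channel scales, zero-points, and gradient handling.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: map float32 values
    onto the integer grid [-127, 127] using a single scale factor."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

x = np.array([0.1, -0.5, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)
# each element's rounding error is bounded by scale / 2
```

Because rounding moves each value by at most half a quantization step, the reconstruction error per element is bounded by `scale / 2`; this is the basic cost/accuracy trade-off the papers below try to shrink.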

Papers

Showing 1251–1300 of 4925 papers

Title | Status | Hype
Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss |  | 0
Clustering with Bregman Divergences: an Asymptotic Analysis |  | 0
Approximately Invertible Neural Network for Learned Image Compression |  | 0
Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things |  | 0
Clustering-Based Evolutionary Federated Multiobjective Optimization and Learning |  | 0
Approximate DCT and Quantization Techniques for Energy-Constrained Image Sensors |  | 0
Cluster-Based Cooperative Digital Over-the-Air Aggregation for Wireless Federated Edge Learning |  | 0
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning |  | 0
Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding |  | 0
1-Bit Compressive Sensing for Efficient Federated Learning Over the Air |  | 0
Efficient Compression of Multitask Multilingual Speech Models |  | 0
Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression |  | 0
Accelerating Deep Learning with Dynamic Data Pruning |  | 0
CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization |  | 0
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores |  | 0
Adaptive quantization with mixed-precision based on low-cost proxy |  | 0
2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search |  | 0
Efficient Asynchronous Federated Learning with Sparsification and Quantization |  | 0
A Post-coder Feedback Approach to Overcome Training Asymmetry in MIMO-TDD |  | 0
Click-through Rate Prediction with Auto-Quantized Contrastive Learning |  | 0
Adaptive Quantization Resolution and Power Control for Federated Learning over Cell-free Networks |  | 0
Classification Accuracy Improvement for Neuromorphic Computing Systems with One-level Precision Synapses |  | 0
Class-based Quantization for Neural Networks |  | 0
Efficient ANN-SNN Conversion with Error Compensation Learning |  | 0
Apollo-Forecast: Overcoming Aliasing and Inference Speed Challenges in Language Models for Time Series Forecasting |  | 0
CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer |  | 0
Adaptive Quantization of Neural Networks |  | 0
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech |  | 0
Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime |  | 0
A Planck Radiation and Quantization Scheme for Human Cognition and Language |  | 0
Choose Your Model Size: Any Compression by a Single Gradient Descent |  | 0
Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning |  | 0
Efficient and Workload-Aware LLM Serving via Runtime Layer Swapping and KV Cache Resizing |  | 0
Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost |  | 0
CHIME: A Compressive Framework for Holistic Interest Modeling |  | 0
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models |  | 0
A Picture is Worth a Billion Bits: Real-Time Image Reconstruction from Dense Binary Pixels |  | 0
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge |  | 0
Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models |  | 0
Adaptive Quantization for Key Generation in Low-Power Wide-Area Networks |  | 0
Accelerating Deep Learning Inference via Freezing |  | 0
Characterizing the Accuracy -- Efficiency Trade-off of Low-rank Decomposition in Language Models |  | 0
Characterizing Coherent Integrated Photonic Neural Networks under Imperfections |  | 0
APG-MOS: Auditory Perception Guided-MOS Predictor for Synthetic Speech |  | 0
Characterization of the frequency response of channel-interleaved photonic ADCs based on the optical time-division demultiplexer |  | 0
Adaptive Quantization for Deep Neural Network |  | 0
An approach to optimize inference of the DIART speaker diarization pipeline |  | 0
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference |  | 0
A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications |  | 0
Characterising Bias in Compressed Models |  | 0
Page 26 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 |  | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 |  | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 |  | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 |  | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 |  | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 |  | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 |  | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 |  | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 |  | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 |  | Unverified
2 | DTQ | MAP | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 |  | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 98.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 92.92 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 95.13 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | TAR @ FAR=1e-4 | 96.38 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 |  | Accuracy | 99.8 |  | Unverified