SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
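
The float-to-fixed-point replacement described above can be illustrated with a minimal sketch. It assumes a symmetric scheme with one shared scale per tensor; the function names `quantize` and `dequantize` and the sample weights are hypothetical, and this is not the adaptive-precision method of the cited paper.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale (assumed symmetric scheme)."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8, 32767 for int16
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [x * scale for x in q]


weights = [0.5, -1.27, 0.0, 1.0]               # illustrative float32-style values
q, scale = quantize(weights)                    # integer codes plus the float scale
restored = dequantize(q, scale)                 # close to weights, within scale / 2
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision traded away for cheaper integer arithmetic.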

Papers

Showing 2151-2200 of 4925 papers

| Title | Status | Hype |
| --- | --- | --- |
| Adaptive Transmission for Distributed Detection in Energy Harvesting Wireless Sensor Networks | | 0 |
| Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization | | 0 |
| Evaluating the Practicality of Learned Image Compression | | 0 |
| COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection | | 0 |
| CNN inference acceleration using dictionary of centroids | | 0 |
| Evaluating Post-Training Compression in GANs using Locality-Sensitive Hashing | | 0 |
| CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture | | 0 |
| Adaptive Training of Random Mapping for Data Quantization | | 0 |
| EuclidNets: Combining hardware and architecture design for Efficient Inference and Training | | 0 |
| EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models | | 0 |
| CNN-based Analog CSI Feedback in FDD MIMO-OFDM Systems | | 0 |
| Estimation and Quantization of Expected Persistence Diagrams | | 0 |
| CNN Acceleration by Low-rank Approximation with Quantized Factors | | 0 |
| Approximate search with quantized sparse representations | | 0 |
| Estimating the Completeness of Discrete Speech Units | | 0 |
| CNN2Gate: Toward Designing a General Framework for Implementation of Convolutional Neural Networks on FPGA | | 0 |
| ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA | | 0 |
| ESC-MVQ: End-to-End Semantic Communication With Multi-Codebook Vector Quantization | | 0 |
| Cluster Regularized Quantization for Deep Networks Compression | | 0 |
| Approximate Probabilistic Neural Networks with Gated Threshold Logic | | 0 |
| Adaptive Sample-space & Adaptive Probability coding: a neural-network based approach for compression | | 0 |
| eSampling: Energy Harvesting ADCs | | 0 |
| ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs | | 0 |
| Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization | | 0 |
| Error Feedback Approach for Quantization Noise Reduction of Distributed Graph Filters | | 0 |
| Cluster Pruning: An Efficient Filter Pruning Method for Edge AI Vision Applications | | 0 |
| Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss | | 0 |
| Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization | | 0 |
| Error-aware Quantization through Noise Tempering | | 0 |
| Clustering with Bregman Divergences: an Asymptotic Analysis | | 0 |
| Approximately Invertible Neural Network for Learned Image Compression | | 0 |
| Adaptive Resource Allocation for Semantic Communication Networks | | 0 |
| Error Analysis of CORDIC Processor with FPGA Implementation | | 0 |
| ERQ: Error Reduction for Post-Training Quantization of Vision Transformers | | 0 |
| E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs | | 0 |
| Clustering-Based Evolutionary Federated Multiobjective Optimization and Learning | | 0 |
| Approximate DCT and Quantization Techniques for Energy-Constrained Image Sensors | | 0 |
| Cluster-Based Cooperative Digital Over-the-Air Aggregation for Wireless Federated Edge Learning | | 0 |
| EQ-Net: A Unified Deep Learning Framework for Log-Likelihood Ratio Estimation and Quantization | | 0 |
| ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning | | 0 |
| Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding | | 0 |
| Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things | | 0 |
| Accelerating Energy-Efficient Federated Learning in Cell-Free Networks with Adaptive Quantization | | 0 |
| 2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency | | 0 |
| EPIM: Efficient Processing-In-Memory Accelerators based on Epitome | | 0 |
| EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | | 0 |
| Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation | | 0 |
| CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization | | 0 |
| Entropy-Driven Mixed-Precision Quantization for Deep Network Design | | 0 |
| Entropy Coding Improvement for Low-complexity Compressive Auto-encoders | | 0 |

Benchmark Results

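| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 99.8 | | Unverified |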