SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
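
To make the float32-to-int8 replacement concrete, here is a minimal sketch of symmetric per-tensor quantization with NumPy. It is a generic illustration, not the adaptive-precision scheme of the cited paper; the function names `quantize_int8` and `dequantize` and the choice of a single max-abs scale are assumptions made for this example.

```python
import numpy as np


def quantize_int8(x: np.ndarray):
    """Map a float32 array to int8 using one shared scale (assumed scheme)."""
    max_abs = float(np.max(np.abs(x)))
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric int8 range.
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 array from the int8 codes."""
    return q.astype(np.float32) * scale


x = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)

# The round-trip error is bounded by roughly half a quantization step.
max_err = float(np.max(np.abs(x - x_hat)))
```

Each value is stored in one byte instead of four, at the cost of a rounding error of about `scale / 2` per element. Real training pipelines typically pick scales per layer or per channel and handle gradients separately, which this sketch does not attempt.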

Papers

Showing 3001-3050 of 4925 papers

Title | Status | Hype
Tetra-AML: Automatic Machine Learning via Tensor Networks | | 0
Di^2Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation | | 0
Text me the data: Generating Ground Pressure Sequence from Textual Descriptions for HAR | | 0
DoStoVoQ: Doubly Stochastic Voronoi Vector Quantization SGD for Federated Learning | | 0
Texture CNN for Thermoelectric Metal Pipe Image Classification | | 0
The Bach Doodle: Approachable music composition with machine learning at scale | | 0
The Binary and Ternary Quantization Can Improve Feature Discrimination | | 0
The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention | | 0
The bottleneck and ceiling effects in quantized tracking control of heterogeneous multi-agent systems under DoS attacks | | 0
The Bussgang Decomposition of Non-Linear Systems: Basic Theory and MIMO Extensions | | 0
The Canonical Distortion Measure for Vector Quantization and Function Approximation | | 0
The Convergence of Sparsified Gradient Methods | | 0
The Cramer-Rao Bound for Signal Parameter Estimation from Quantized Data | | 0
The Devil is in the Details: Simple Remedies for Image-to-LiDAR Representation Learning | | 0
The effect of fatigue on the performance of online writer recognition | | 0
The Effect of Quantization in Federated Learning: A Rényi Differential Privacy Perspective | | 0
The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation | | 0
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models | | 0
The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs | | 0
The Impact of Quantization on the Robustness of Transformer-based Text Classifiers | | 0
The Interpretability of Codebooks in Model-Based Reinforcement Learning is Limited | | 0
The Nature of Mathematical Modeling and Probabilistic Optimization Engineering in Generative AI | | 0
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures | | 0
The quantization error in a Self-Organizing Map as a contrast and colour specific indicator of single-pixel change in large random patterns | | 0
The Rate-Distortion-Accuracy Tradeoff: JPEG Case Study | | 0
The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020 | | 0
The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic | | 0
The Uniqueness of LLaMA3-70B Series with Per-Channel Quantization | | 0
The Wavefunction of Continuous-Time Recurrent Neural Networks | | 0
ThinK: Thinner Key Cache by Query-Driven Pruning | | 0
Three Quantization Regimes for ReLU Networks | | 0
Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability | | 0
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors | | 0
Time-Correlated Sparsification for Communication-Efficient Federated Learning | | 0
Time regularization as a solution to mitigate quantization induced performance degradation | | 0
Timestep-Aware Correction for Quantized Diffusion Models | | 0
Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation | | 0
TinyissimoYOLO: A Quantized, Low-Memory Footprint, TinyML Object Detection Network for Low Power Microcontrollers | | 0
TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems | | 0
TinyM^2Net: A Flexible System Algorithm Co-designed Multimodal Learning Framework for Tiny Devices | | 0
TinyM^2Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment | | 0
tinySNN: Towards Memory- and Energy-Efficient Spiking Neural Networks | | 0
Tiny-VBF: Resource-Efficient Vision Transformer based Lightweight Beamformer for Ultrasound Single-Angle Plane Wave Imaging | | 0
TinyVQA: Compact Multimodal Deep Neural Network for Visual Question Answering on Resource-Constrained Devices | | 0
Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions | | 0
TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models | | 0
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference | | 0
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis | | 0
Topological Analysis for Detecting Anomalies (TADA) in Time Series | | 0
Topologically Controlled Lossy Compression | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified