SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
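To make the idea concrete, here is a minimal sketch of per-tensor symmetric int8 quantization. This is an illustrative example only: the function names and the symmetric-scale scheme are assumptions for exposition, not the method of any specific paper listed below.

```python
def quantize_int8(values):
    """Map float values to int8 codes using one symmetric per-tensor scale."""
    # Scale chosen so the largest magnitude maps to +/-127; fall back to 1.0
    # for an all-zero tensor to avoid division by zero.
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    # Round to nearest integer and clamp to the int8 range [-128, 127].
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.42, -1.37, 0.05, 0.99]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

With this scheme the round-trip error per element is bounded by half the scale, which is the trade-off the papers in this list try to manage with more sophisticated schemes (per-channel scales, outlier handling, quantization-aware training).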

Papers

Showing 2051–2100 of 4925 papers

| Title | Status | Hype |
| --- | --- | --- |
| EdgeFusion: On-Device Text-to-Image Generation | | 0 |
| LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory | | 0 |
| QGen: On the Ability to Generalize in Quantization Aware Training | | 0 |
| Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems | | 0 |
| Variational quantization for state space models | Code | 0 |
| Comprehensive Survey of Model Compression and Speed up for Vision Transformers | | 0 |
| Quantization of Large Language Models with an Overdetermined Basis | | 0 |
| Efficient and accurate neural field reconstruction using resistive memory | | 0 |
| TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models | | 0 |
| SNN4Agents: A Framework for Developing Energy-Efficient Embodied Spiking Neural Networks for Autonomous Agents | Code | 0 |
| Bullion: A Column Store for Machine Learning | | 0 |
| Full-Duplex Beyond Self-Interference: The Unlimited Sensing Way | | 0 |
| Lossy Image Compression with Foundation Diffusion Models | | 0 |
| 1-bit Quantized On-chip Hybrid Diffraction Neural Network Enabled by Authentic All-optical Fully-connected Architecture | | 0 |
| Frame Quantization of Neural Networks | | 0 |
| Edge-Efficient Deep Learning Models for Automatic Modulation Classification: A Performance Analysis | | 0 |
| CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers | Code | 0 |
| Differentiable Search for Finding Optimal Quantization Strategy | | 0 |
| Collaborative Edge AI Inference over Cloud-RAN | | 0 |
| Encoder-Quantization-Motion-based Video Quality Metrics | | 0 |
| Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws | | 0 |
| Investigating the Impact of Quantization on Adversarial Robustness | | 0 |
| Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators | Code | 0 |
| David and Goliath: An Empirical Evaluation of Attacks and Defenses for QNNs at the Deep Edge | Code | 0 |
| Gull: A Generative Multifunctional Audio Codec | | 0 |
| Weakly Supervised Deep Hyperspherical Quantization for Image Retrieval | Code | 0 |
| Nanometer Scanning with Micrometer Sensing: Beating Quantization Constraints in Lissajous Trajectory Tracking | | 0 |
| What Happens When Small Is Made Smaller? Exploring the Impact of Compression on Small Data Pretrained Language Models | | 0 |
| Fine-Tuning, Quantization, and LLMs: Navigating Unintended Outcomes | | 0 |
| TinyVQA: Compact Multimodal Deep Neural Network for Visual Question Answering on Resource-Constrained Devices | | 0 |
| Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization | Code | 0 |
| DI-Retinex: Digital-Imaging Retinex Theory for Low-Light Image Enhancement | | 0 |
| CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | | 0 |
| Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models | | 0 |
| DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization | | 0 |
| NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation | | 0 |
| Minimize Quantization Output Error with Bias Compensation | Code | 0 |
| On the Effect of Quantization on Dynamic Mode Decomposition | | 0 |
| RefQSR: Reference-based Quantization for Image Super-Resolution Networks | | 0 |
| A Novel Audio Representation for Music Genre Identification in MIR | | 0 |
| Instance-Aware Group Quantization for Vision Transformers | | 0 |
| Towards Variable and Coordinated Holistic Co-Speech Motion Generation | | 0 |
| Accurate Block Quantization in LLMs with Outliers | | 0 |
| Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs | | 0 |
| QNCD: Quantization Noise Correction for Diffusion Models | Code | 0 |
| Meta-Heuristic Fronthaul Bit Allocation for Cell-free Massive MIMO Systems | | 0 |
| Uncertainty-Aware Deep Video Compression with Ensembles | | 0 |
| Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence | | 0 |
| Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models | | 0 |
| Order of Compression: A Systematic and Optimal Sequence to Combinationally Compress CNN | | 0 |
Page 42 of 99

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified |
| 2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified |
| 3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified |
| 4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified |
| 5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified |
| 6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified |
| 7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified |
| 8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified |
| 9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified |
| 10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified |
| 2 | DTQ | MAP | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified |
| 2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 98.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 92.92 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | | Accuracy | 99.8 | | Unverified |