SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
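
As a rough illustration of the float-to-fixed-point replacement described above, here is a minimal sketch of symmetric per-tensor int8 quantization. The scheme and the helper names `quantize_int8` and `dequantize` are assumptions chosen for illustration; this is not the adaptive-precision method of the cited paper.

```python
from array import array


def quantize_int8(values):
    """Map floats to int8 codes with one shared scale (symmetric scheme, assumed)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The largest magnitude maps to code 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp into the signed 8-bit range.
    codes = array("b", (max(-127, min(127, round(v / scale))) for v in values))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [code * scale for code in codes]


x = [-1.0, -0.4, 0.0, 0.3, 1.0]
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
print(list(q), scale, max_err)
```

Each value is stored in one byte instead of four, and the rounding error per element is at most half of `scale`. A single scale per tensor is the simplest choice; per-channel scales or asymmetric zero-points are common alternatives.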

Papers

Showing 2201-2250 of 4925 papers

Title | Status | Hype
Reducing Communication for Split Learning by Randomized Top-k Sparsification | | 0
SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics | | 0
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes | | 0
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models | Code | 3
BRICS: Bi-level feature Representation of Image CollectionS | | 0
A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images | Code | 0
Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking | | 0
Disentanglement via Latent Quantization | Code | 1
Examining the Role and Limits of Batchnorm Optimization to Mitigate Diverse Hardware-noise in In-memory Computing | | 0
Efficient Storage of Fine-Tuned Models via Low-Rank Approximation of Weight Residuals | | 0
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time | | 0
2-bit Conformer quantization for automatic speech recognition | | 0
PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration | Code | 0
NVTC: Nonlinear Vector Transform Coding | Code | 1
KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration | Code | 1
RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models | | 0
Just CHOP: Embarrassingly Simple LLM Compression | | 0
BinaryViT: Towards Efficient and Accurate Binary Vision Transformers | | 0
QLoRA: Efficient Finetuning of Quantized LLMs | Code | 6
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation | Code | 1
Adversarial Defenses via Vector Quantization | | 0
Downlink Clustering-Based Scheduling of IRS-Assisted Communications With Reconfiguration Constraints | | 0
Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization | | 0
Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML | | 0
Differential Privacy with Random Projections and Sign Random Projections | | 0
TinyissimoYOLO: A Quantized, Low-Memory Footprint, TinyML Object Detection Network for Low Power Microcontrollers | | 0
Digital-SC: Digital Semantic Communication with Adaptive Network Split and Learned Non-Linear Quantization | | 0
TSPTQ-ViT: Two-scaled post-training quantization for vision transformer | | 0
Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline | Code | 1
Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study | | 0
FAQ: Mitigating the Impact of Faults in the Weight Memory of DNN Accelerators through Fault-Aware Quantization | | 0
Atomic Anatomy of Low-Inertia Power Systems | | 0
Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models | | 0
Bi-ViT: Pushing the Limit of Vision Transformer Quantization | | 0
Two-Bit RIS-Aided Communications at 3.5GHz: Some Insights from the Measurement Results Under Multiple Practical Scenes | | 0
Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization | Code | 1
ReTAG: Reasoning Aware Table to Analytic Text Generation | | 0
PTQD: Accurate Post-Training Quantization for Diffusion Models | Code | 1
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization | | 0
QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation | Code | 1
Q-SHED: Distributed Optimization at the Edge via Hessian Eigenvectors Quantization | | 0
DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition | | 0
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt | | 0
MINT: Multiplier-less INTeger Quantization for Energy Efficient Spiking Neural Networks | Code | 0
Component Training of Turbo Autoencoders | | 0
Fast Inference of Tree Ensembles on ARM Devices | | 0
Task-Oriented Communication Design at Scale | | 0
Designing Discontinuities | | 0
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks | | 0
Federated TD Learning over Finite-Rate Erasure Channels: Linear Speedup under Markovian Sampling | | 0
Page 45 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified