SOTAVerified

Quantization

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
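To make the definition above concrete, here is a minimal pure-Python sketch of uniform affine quantization from float values into the int8 range. This is an illustration of the general idea only, not the scheme from the cited paper; real frameworks typically calibrate the range and choose scale/zero-point per tensor or per channel.

```python
def quantize_int8(values):
    """Uniform affine quantization of floats into the int8 range [-128, 127].

    Illustrative sketch only: the range is taken directly from the data,
    whereas production quantizers use calibrated ranges.
    """
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 if hi != lo else 1.0
    zero_point = round(-lo / scale) - 128  # maps lo to -128
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Approximate reconstruction of the original floats."""
    return [(qi - zero_point) * scale for qi in q]


x = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(x)
x_hat = dequantize(q, scale, zp)
# Each reconstructed value differs from the original by at most one scale step.
```

The savings come from storing and multiplying 8-bit integers instead of 32-bit floats; the price is the rounding error visible in `x_hat`, which is bounded by the quantization step `scale`.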

Papers

Showing 601-650 of 4925 papers

Title | Status | Hype
DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks | Code | 1
Dataset Quantization with Active Learning based Adaptive Sampling | Code | 1
BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1
Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1
Learning to Groove with Inverse Sequence Transformations | Code | 1
Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Code | 1
Convolutional Autoencoder-Based Phase Shift Feedback Compression for Intelligent Reflecting Surface-Assisted Wireless Systems | Code | 1
ConveRT: Efficient and Accurate Conversational Representations from Transformers | Code | 1
Continuous Visual Autoregressive Generation via Score Maximization | Code | 1
LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time | Code | 1
Lightweight Super-Resolution Head for Human Pose Estimation | Code | 1
Benchmarking Quantized Neural Networks on FPGAs with FINN | Code | 1
Learnable Lookup Table for Neural Network Quantization | Code | 1
Linear-Time Self Attention with Codeword Histogram for Efficient Recommendation | Code | 1
A Memory Efficient Baseline for Open Domain Question Answering | Code | 1
BAND-2k: Banding Artifact Noticeable Database for Banding Detection and Quality Assessment | Code | 1
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding | Code | 1
Context-aware Communication for Multi-agent Reinforcement Learning | Code | 1
Continual Learning via Bit-Level Information Preserving | Code | 1
Learned Step Size Quantization | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression | Code | 1
Confounding Tradeoffs for Neural Network Quantization | Code | 1
BAGUA: Scaling up Distributed Learning with System Relaxations | Code | 1
Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation | Code | 1
Accurate KV Cache Quantization with Outlier Tokens Tracing | Code | 1
Conditional Coding and Variable Bitrate for Practical Learned Video Coding | Code | 1
Lossy Image Compression with Quantized Hierarchical VAEs | Code | 1
CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution | Code | 1
Bi3D: Stereo Depth Estimation via Binary Classifications | Code | 1
A Benchmark for Gaussian Splatting Compression and Quality Assessment Study | Code | 1
BiDM: Pushing the Limit of Quantization for Diffusion Models | Code | 1
DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization | Code | 1
LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning | Code | 1
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization | Code | 1
ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation | Code | 1
Compression with Bayesian Implicit Neural Representations | Code | 1
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models | Code | 1
LaCo: Large Language Model Pruning via Layer Collapse | Code | 1
Differentiable JPEG: The Devil is in the Details | Code | 1
Learning Architectures for Binary Networks | Code | 1
L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient and Accurate Deep Learning | Code | 1
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers | Code | 1
BinaryHPE: 3D Human Pose and Shape Estimation via Binarization | Code | 1
JointSQ: Joint Sparsification-Quantization for Distributed Learning | Code | 1
Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs | Code | 1
Compress Any Segment Anything Model (SAM) | Code | 1
Joint Privacy Enhancement and Quantization in Federated Learning | Code | 1
An Automatic Graph Construction Framework based on Large Language Models for Recommendation | Code | 1
kANNolo: Sweet and Smooth Approximate k-Nearest Neighbors Search | Code | 1
Page 13 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified