SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
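The definition above maps floating-point values onto a small integer grid via a scale factor. A minimal sketch of symmetric per-tensor int8 quantization in plain Python (the function names are illustrative, not taken from any paper listed below):

```python
def quantize_int8(xs):
    """Symmetric per-tensor quantization: map floats onto the int8 range [-127, 127].

    Returns the integer codes and the scale needed to dequantize them.
    """
    scale = max(abs(v) for v in xs) / 127.0
    if scale == 0.0:          # all-zero tensor: any scale works
        scale = 1.0
    q = [max(-127, min(127, round(v / scale))) for v in xs]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 2.0]
q, s = quantize_int8(weights)
restored = dequantize(q, s)
# Per-element reconstruction error is bounded by scale / 2.
```

The largest-magnitude value determines the scale, so outliers directly coarsen the grid for every other value; this is the problem that outlier-suppression methods such as PrefixQuant (listed below) target.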

Papers

Showing 901–950 of 4925 papers

Title | Status | Hype
SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression | Code | 1
PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization | - | 0
QEFT: Quantization for Efficient Fine-Tuning of LLMs | Code | 0
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification | - | 0
DeltaDQ: Ultra-High Delta Compression for Fine-Tuned LLMs via Group-wise Dropout and Separate Quantization | - | 0
ACCEPT: Adaptive Codebook for Composite and Efficient Prompt Tuning | Code | 0
M^2-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization | - | 0
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | - | 0
Scalable Representation Learning for Multimodal Tabular Transactions | - | 0
Q-VLM: Post-training Quantization for Large Vision-Language Models | Code | 2
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion | Code | 0
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | Code | 7
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression | - | 0
QuAILoRA: Quantization-Aware Initialization for LoRA | - | 0
Perceptual Quality Assessment of Trisoup-Lifting Encoded 3D Point Clouds | Code | 0
Scaling Laws for Mixed quantization in Large Language Models | - | 0
JPEG Inspired Deep Learning | Code | 0
Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-training | - | 0
QERA: an Analytical Framework for Quantization Error Reconstruction | - | 0
Accelerating Error Correction Code Transformers | Code | 0
QT-DoG: Quantization-aware Training for Domain Generalization | Code | 1
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Code | 2
Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression | - | 0
Restructuring Vector Quantization with the Rotation Trick | Code | 4
Variable Bitrate Residual Vector Quantization for Audio Coding | - | 0
Integrated Encoding and Quantization to Enhance Quanvolutional Neural Networks | Code | 0
Designing a Classifier for Active Fire Detection from Multispectral Satellite Imagery Using Neural Architecture Search | - | 0
Variable Resolution Pixel Quantization for Low Power Machine Vision Application on Edge | - | 0
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization | Code | 2
Continuous Approximations for Improving Quantization Aware Training of LLMs | - | 0
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis | - | 0
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | - | 0
MIMO Detection with Spatial Sigma-Delta ADCs: A Variational Bayesian Approach | - | 0
Resource-aware Mixed-precision Quantization for Enhancing Deployability of Transformers for Time-series Forecasting on Embedded FPGAs | - | 0
Generative Semantic Communication for Text-to-Speech Synthesis | - | 0
ARB-LLM: Alternating Refined Binarizations for Large Language Models | Code | 1
Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization | Code | 1
EXAQ: Exponent Aware Quantization For LLMs Acceleration | Code | 0
Lightweight Diffusion Models for Resource-Constrained Semantic Communication | Code | 1
Overcoming Representation Bias in Fairness-Aware data Repair using Optimal Transport | - | 0
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Code | 7
SEAL: SEmantic-Augmented Imitation Learning via Language Model | - | 0
Remember and Recall: Associative-Memory-based Trajectory Prediction | - | 0
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation | Code | 2
Restorative Speech Enhancement: A Progressive Approach Using SE and Codec Modules | - | 0
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices | Code | 1
ImageFolder: Autoregressive Image Generation with Folded Tokens | Code | 3
Getting Free Bits Back from Rotational Symmetries in LLMs | - | 0
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging | - | 0
Deep activity propagation via weight initialization in spiking neural networks | - | 0
Page 19 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | - | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | - | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | - | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | - | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | - | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | - | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | - | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | - | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | - | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | - | Unverified
2 | DTQ | MAP | 0.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | - | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 98.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 92.92 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 95.13 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | TAR @ FAR=1e-4 | 96.38 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | Accuracy | 99.8 | - | Unverified