SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
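As an illustration of the idea above, here is a minimal sketch of symmetric per-tensor quantization to int8 in plain Python. It is not taken from any listed paper; the function names and the simple max-abs scale rule are illustrative assumptions.

```python
def quantize_int8(values):
    """Map float values to int8 codes with a shared (per-tensor) scale.

    Symmetric scheme: scale = max|x| / 127, codes clipped to [-127, 127].
    """
    max_abs = max(abs(v) for v in values) or 1.0  # avoid divide-by-zero
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.2, 0.03, 1.27]
codes, scale = quantize_int8(weights)
recovered = dequantize(codes, scale)
# Round-trip error is bounded by half a quantization step (scale / 2).
```

Storing `codes` (one byte each) plus a single float `scale` is what yields the memory and compute savings relative to keeping every value in float32.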

Papers

Showing 701–750 of 4925 papers

Title | Status | Hype
Fuzzy Norm-Explicit Product Quantization for Recommender Systems | — | 0
SizeGS: Size-aware Compression of 3D Gaussians with Hierarchical Mixed Precision Quantization | — | 0
Efficient Distributed Training through Gradient Compression with Sparsification and Quantization Techniques | — | 0
Sensor Selection and Distributed Quantization for Energy Efficiency in Massive MTC | — | 0
Error Feedback Approach for Quantization Noise Reduction of Distributed Graph Filters | — | 0
ULMRec: User-centric Large Language Model for Sequential Recommendation | — | 0
Temporally Compressed 3D Gaussian Splatting for Dynamic Scenes | Code | 1
GAQAT: gradient-adaptive quantization-aware training for domain generalization | — | 0
Trimming Down Large Spiking Vision Transformers via Heterogeneous Quantization Search | — | 0
APOLLO: SGD-like Memory, AdamW-level Performance | Code | 3
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization | — | 0
Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task | — | 0
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos | Code | 2
Prompting Large Language Models for Clinical Temporal Relation Extraction | — | 0
Evaluating Single Event Upsets in Deep Neural Networks for Semantic Segmentation: an embedded system perspective | Code | 0
FlashAttention on a Napkin: A Diagrammatic Approach to Deep Learning IO-Awareness | — | 0
Unifying KV Cache Compression for Large Language Models with LeanKV | — | 0
Mixed-Precision Quantization: Make the Best Use of Bits Where They Matter Most | — | 0
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation | Code | 3
Designing DNNs for a trade-off between robustness and processing performance in embedded devices | — | 0
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models | — | 0
Robust Precoding for Multi-User Visible Light Communications with Quantized Channel Information | — | 0
3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation | — | 0
Lean classical-quantum hybrid neural network model for image classification | — | 0
Scaling Image Tokenizers with Grouped Spherical Quantization | Code | 0
Taming Scalable Visual Tokenizer for Autoregressive Image Generation | Code | 4
CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs | — | 0
Optimizing Domain-Specific Image Retrieval: A Benchmark of FAISS and Annoy with Fine-Tuned Features | — | 0
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification | — | 0
XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation | Code | 3
Reducing Inference Energy Consumption Using Dual Complementary CNNs | Code | 0
Improving Detail in Pluralistic Image Inpainting with Feature Dequantization | Code | 1
RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy | Code | 0
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | — | 0
DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation | Code | 1
A Wave is Worth 100 Words: Investigating Cross-Domain Transferability in Time Series | — | 0
LAMBDA: Covering the Multimodal Critical Scenarios for Automated Driving Systems by Search Space Quantization | — | 0
Privacy-Preserving Orthogonal Aggregation for Guaranteeing Gender Fairness in Federated Recommendation | — | 0
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation | — | 0
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding | — | 0
Scaling Transformers for Low-Bitrate High-Quality Speech Coding | Code | 3
Quantized Delta Weight Is Safety Keeper | — | 0
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads | — | 0
On the effectiveness of discrete representations in sparse mixture of experts | — | 0
FAMES: Fast Approximate Multiplier Substitution for Mixed-Precision Quantized DNNs--Down to 2 Bits! | — | 0
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection | — | 0
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization | Code | 0
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens | — | 0
MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension | Code | 2
Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving | — | 0
Page 15 of 99

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | — | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | — | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | — | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | — | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | — | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | — | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | — | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | — | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | — | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | — | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | — | Unverified
2 | DTQ | MAP | 0.79 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | — | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 98.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 92.92 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 95.13 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | TAR @ FAR=1e-4 | 96.38 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | — | Accuracy | 99.8 | — | Unverified