SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
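
The float-to-integer replacement described above can be illustrated with a minimal sketch. It assumes plain symmetric per-tensor int8 quantization with a single scale factor; this is a generic scheme chosen for illustration, not the adaptive precision method of the cited paper. The names `quantize_int8` and `dequantize` are hypothetical helpers, not part of any library.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one shared scale (symmetric scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: the largest magnitude maps to 127; an all-zero input uses scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [0.6, -1.27, 0.25, 0.0]
codes, scale = quantize_int8(weights)   # codes == [60, -127, 25, 0]
restored = dequantize(codes, scale)

# Each restored value differs from the original by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

In this scheme the integer codes carry the values and the scale is stored once per tensor, so arithmetic can run on the cheaper integer type at the cost of a rounding error bounded by `scale / 2` per element.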

Papers

Showing 2901-2950 of 4925 papers

Title | Status | Hype
ReTAG: Reasoning Aware Table to Analytic Text Generation | | 0
Stochastic Gradient Langevin Dynamics Based on Quantization with Increasing Resolution | | 0
Stochastic Hybrid Combining Design for Quantized Massive MIMO Systems | | 0
Stochastic Learning Equation using Monotone Increasing Resolution of Quantization | | 0
Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks | | 0
Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks | | 0
Stochastic-Sign SGD for Federated Learning with Theoretical Guarantees | | 0
Stopping Rules for Bag-of-Words Image Search and Its Application in Appearance-Based Localization | | 0
STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM | | 0
Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks | | 0
Strategizing against Q-learners: A Control-theoretical Approach | | 0
Streaming Parrotron for on-device speech-to-speech conversion | | 0
Streamlining Tensor and Network Pruning in PyTorch | | 0
Strong Solutions and Quantization-Based Numerical Schemes for a Class of Non-Markovian Volatility Models | | 0
Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation | | 0
Structural Latency Perturbation in Large Language Models Through Recursive State Induction | | 0
Structured adaptive and random spinners for fast machine learning computations | | 0
Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation | | 0
Structured Compression by Weight Encryption for Unstructured Pruning and Quantization | | 0
Neural Language of Thought Models | | 0
Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection | | 0
Studying the Interplay between Information Loss and Operation Loss in Representations for Classification | | 0
Study of Encoder-Decoder Architectures for Code-Mix Search Query Translation | | 0
Study of Energy-Efficient Distributed RLS-based Learning with Coarsely Quantized Signals | | 0
Style Quantization for Data-Efficient GAN Training | | 0
Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition | | 0
Sub-8-bit quantization for on-device speech recognition: a regularization-free approach | | 0
Sub 8-Bit Quantization of Streaming Keyword Spotting Models for Embedded Chipsets | | 0
Subgraph Stationary Hardware-Software Inference Co-Design | | 0
SUBIC: A supervised, structured binary code for image search | | 0
Subjective Quality Database and Objective Study of Compressed Point Clouds With 6DoF Head-Mounted Display | | 0
Sublinear quantum algorithms for training linear and kernel-based classifiers | | 0
Subspace Robust Wasserstein Distances | | 0
Subtensor Quantization for Mobilenets | | 0
Succinct Compression: Near-Optimal and Lossless Compression of Deep Neural Networks during Inference Runtime | | 0
Sum Rate Maximization in the Constant Envelope MIMO Downlink with the RZF Precoder | | 0
Super-High-Fidelity Image Compression via Hierarchical-ROI and Adaptive Quantization | | 0
Super-relaxation of space-time-quantized ensemble of energy loads to curtail their synchronization after demand response perturbation | | 0
Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images | | 0
Supervised Deep Hashing for High-dimensional and Heterogeneous Case-based Reasoning | | 0
Supervised Learning in the Presence of Concept Drift: A modelling framework | | 0
Supervised Matrix Factorization for Cross-Modality Hashing | | 0
Supervised Quantization for Similarity Search | | 0
Support Recovery in Universal One-bit Compressed Sensing | | 0
Survey of Quantization Techniques for On-Device Vision-based Crack Detection | | 0
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency | | 0
SUT System Description for Anti-Spoofing 2017 Challenge | | 0
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention | | 0
SVGformer: Representation Learning for Continuous Vector Graphics Using Transformers | | 0
SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | | Unverified
2 | DTQ | MAP | 0.79 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 98.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 92.92 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 95.13 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | TAR @ FAR=1e-4 | 96.38 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | | Accuracy | 99.8 | | Unverified