SOTAVerified

Quantization

Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
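As a minimal illustration of the idea (a sketch of uniform affine quantization in general, not the cited paper's specific method), the snippet below maps a float32 tensor onto int8 and back; the function names and the NumPy-based setup are assumptions made for this example.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map a float32 tensor onto int8 with a uniform affine (scale/zero-point) scheme."""
    qmin, qmax = -128, 127
    # The scale stretches the observed float range over the 256 int8 levels;
    # the epsilon guards against a constant (zero-range) input.
    scale = max((float(x.max()) - float(x.min())) / (qmax - qmin), 1e-12)
    zero_point = int(round(qmin - float(x.min()) / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover an approximate float32 tensor from its int8 encoding."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(x)
print("max abs error:", np.abs(x - dequantize(q, scale, zp)).max())
```

Training-time schemes such as the cited paper's go further and apply fixed-point quantization to back propagation as well, but the same per-tensor scale/zero-point machinery is the basic building block.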

Papers

Showing 2001–2050 of 4925 papers

Papers with available code are marked [Code].

Universal Joint Source-Channel Coding for Modulation-Agnostic Semantic Communication
Flattened one-bit stochastic gradient descent: compressed distributed optimization with controlled variance
Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network
The Effect of Quantization in Federated Learning: A Rényi Differential Privacy Perspective
Properties that allow or prohibit transferability of adversarial attacks among quantized networks [Code]
Neural Speech Coding for Real-time Communications using Constant Bitrate Scalar Quantization
FDD Massive MIMO: How to Optimally Combine UL Pilot and Limited DL CSI Feedback?
Goal-oriented compression for L_p-norm-type goal functions: Application to power consumption scheduling
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Post Training Quantization of Large Language Models with Microscaling Formats
Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
Compression-Realized Deep Structural Network for Video Quality Enhancement
Characterizing the Accuracy -- Efficiency Trade-off of Low-rank Decomposition in Language Models
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Custom Gradient Estimators are Straight-Through Estimators in Disguise
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer [Code]
Compression-based Privacy Preservation for Distributed Nash Equilibrium Seeking in Aggregative Games
Quantifying the Capabilities of LLMs across Scale and Precision
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
DeltaKWS: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM
Joint Discrete Precoding and RIS Optimization for RIS-Assisted MU-MIMO Communication Systems
Efficient Text-driven Motion Generation via Latent Consistency Training [Code]
Exploring Extreme Quantization in Spiking Language Models
Three Quantization Regimes for ReLU Networks
Lightweight Change Detection in Heterogeneous Remote Sensing Images with Online All-Integer Pruning Training
Network reconstruction via the minimum description length principle
Efficient Compression of Multitask Multilingual Speech Models
Joint Sequential Fronthaul Quantization and Hardware Complexity Reduction in Uplink Cell-Free Massive MIMO Networks
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment [Code]
Wake Vision: A Tailored Dataset and Benchmark Suite for TinyML Computer Vision Applications
When Quantization Affects Confidence of Large Language Models? [Code]
Self-supervised Pre-training of Text Recognizers [Code]
Investigating Automatic Scoring and Feedback using Large Language Models
Transition Rate Scheduling for Quantization-Aware Training
Quantized Context Based LIF Neurons for Recurrent Spiking Neural Networks in 45nm
Enhancing Channel Estimation in Quantized Systems with a Generative Prior
sDAC -- Semantic Digital Analog Converter for Semantic Communications
MMGRec: Multimodal Generative Recommendation with Transformer Model
How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training
CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture
Latency-Distortion Tradeoffs in Communicating Classification Results over Noisy Channels
FedMPQ: Secure and Communication-Efficient Federated Learning with Multi-codebook Product Quantization
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression
A SER-based Device Selection Mechanism in Multi-bits Quantization Federated Learning
EdgeFusion: On-Device Text-to-Image Generation

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FQ-ViT (ViT-L) | Top-1 Accuracy (%) | 85.03 | – | Unverified
2 | FQ-ViT (ViT-B) | Top-1 Accuracy (%) | 83.31 | – | Unverified
3 | FQ-ViT (Swin-B) | Top-1 Accuracy (%) | 82.97 | – | Unverified
4 | FQ-ViT (Swin-S) | Top-1 Accuracy (%) | 82.71 | – | Unverified
5 | FQ-ViT (DeiT-B) | Top-1 Accuracy (%) | 81.2 | – | Unverified
6 | FQ-ViT (Swin-T) | Top-1 Accuracy (%) | 80.51 | – | Unverified
7 | FQ-ViT (DeiT-S) | Top-1 Accuracy (%) | 79.17 | – | Unverified
8 | Xception W8A8 | Top-1 Accuracy (%) | 78.97 | – | Unverified
9 | ADLIK-MO-ResNet50-W4A4 | Top-1 Accuracy (%) | 77.88 | – | Unverified
10 | ADLIK-MO-ResNet50-W3A4 | Top-1 Accuracy (%) | 77.34 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_3 | MAP | 160,327.04 | – | Unverified
2 | DTQ | MAP | 0.79 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | OutEffHop-Bert_base | Perplexity | 6.3 | – | Unverified
2 | OutEffHop-Bert_base | Perplexity | 6.21 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | – | Accuracy | 98.13 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | – | Accuracy | 92.92 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSD ResNet50 V1 FPN 640x640 | MAP | 34.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | – | TAR @ FAR=1e-4 | 95.13 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | – | TAR @ FAR=1e-4 | 96.38 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 3DCNN_VIVA_5 | All | 84,809,664 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | – | Accuracy | 99.8 | – | Unverified
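Most of the claimed numbers above are Top-1 accuracy on a classification benchmark; that metric is simply the fraction of samples whose highest-scoring class matches the ground-truth label. A minimal sketch, using hypothetical logits and labels arrays purely for illustration:

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose argmax prediction equals the label."""
    return float((logits.argmax(axis=1) == labels).mean())

# Hypothetical example: 3 samples, 4 classes.
logits = np.array([[0.1, 2.0, 0.3, 0.0],
                   [1.5, 0.2, 0.1, 0.0],
                   [0.0, 0.1, 0.2, 3.0]])
labels = np.array([1, 0, 2])
print(top1_accuracy(logits, labels))  # 2 of 3 correct -> ~0.667
```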