Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients | Jul 17, 2024 | image-classification, Image Classification | Unverified 0
StoX-Net: Stochastic Processing of Partial Sums for Efficient In-Memory Computing DNN Accelerators | Jul 17, 2024 | Quantization | Code Available 0
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Jul 17, 2024 | Decoder, Image Enhancement | Code Available 2
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models | Jul 17, 2024 | Mamba, Model Compression | Unverified 0
Rate-Distortion-Cognition Controllable Versatile Neural Image Compression | Jul 16, 2024 | Image Compression, Image Reconstruction | Unverified 0
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment | Jul 16, 2024 | Quantization, Scheduling | Unverified 0
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors | Jul 16, 2024 | GPU, Neural Network Compression | Unverified 0
Exploring Quantization for Efficient Pre-Training of Transformer Language Models | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available 1
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models | Jul 16, 2024 | Quantization | Code Available 1
NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks | Jul 16, 2024 | Quantization | Code Available 0
QVD: Post-training Quantization for Video Diffusion Models | Jul 16, 2024 | Computational Efficiency, Quantization | Unverified 0
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices | Jul 16, 2024 | Quantization | Code Available 0
Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Jul 15, 2024 | Quantization | Code Available 3
Quality Scalable Quantization Methodology for Deep Learning on Edge | Jul 15, 2024 | Deep Learning, Edge-computing | Unverified 0
SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation | Jul 15, 2024 | Click-Through Rate Prediction, Quantization | Unverified 0
Qwen2 Technical Report | Jul 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available 13
Quantized Prompt for Efficient Generalization of Vision-Language Models | Jul 15, 2024 | General Knowledge, Language Modelling | Code Available 0
LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid | Jul 14, 2024 | GPU, Language Modeling | Unverified 0
A Bag of Tricks for Scaling CPU-based Deep FFMs to more than 300m Predictions per Second | Jul 14, 2024 | Click-Through Rate Prediction, CPU | Unverified 0
One-Bit MIMO Detection: From Global Maximum-Likelihood Detector to Amplitude Retrieval Approach | Jul 13, 2024 | Quantization, Retrieval | Unverified 0
Semi-supervised 3D Object Detection with PatchTeacher and PillarMix | Jul 13, 2024 | 3D Object Detection, Data Augmentation | Code Available 0
PSC: Posterior Sampling-Based Compression | Jul 13, 2024 | Decoder, Image Compression | Code Available 1
Optimization of DNN-based speaker verification model through efficient quantization technique | Jul 12, 2024 | Quantization, Speaker Verification | Unverified 0
Accuracy is Not All You Need | Jul 12, 2024 | All, Quantization | Unverified 0
On Exact Bit-level Reversible Transformers Without Changing Architectures | Jul 12, 2024 | image-classification, Image Classification | Code Available 1
Distributed Deep Reinforcement Learning Based Gradient Quantization for Federated Learning Enabled Vehicle Edge Computing | Jul 11, 2024 | Deep Reinforcement Learning, Edge-computing | Unverified 0
ADMM Based Semi-Structured Pattern Pruning Framework For Transformer | Jul 11, 2024 | CoLA, Quantization | Unverified 0
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Jul 11, 2024 | GPU, Quantization | Code Available 12
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients | Jul 11, 2024 | Quantization | Code Available 5
Autoregressive Speech Synthesis without Vector Quantization | Jul 11, 2024 | Audio Compression, Diversity | Unverified 0
Applying generative neural networks for fast simulations of the ALICE (CERN) experiment | Jul 10, 2024 | Quantization | Code Available 0
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPU, Quantization | Code Available 3
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization | Jul 10, 2024 | parameter-efficient fine-tuning, Quantization | Code Available 1
Dataset Quantization with Active Learning based Adaptive Sampling | Jul 9, 2024 | Active Learning, Dataset Distillation | Code Available 1
ERQ: Error Reduction for Post-Training Quantization of Vision Transformers | Jul 9, 2024 | Quantization, regression | Unverified 0
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens | Jul 7, 2024 | Language Modelling, Large Language Model | Code Available 11
Ternary Spike-based Neuromorphic Signal Processing System | Jul 7, 2024 | Quantization | Unverified 0
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs | Jul 7, 2024 | Contrastive Learning, object-detection | Code Available 1
OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks | Jul 7, 2024 | Quantization | Code Available 1
Integer-only Quantized Transformers for Embedded FPGA-based Time-series Forecasting in AIoT | Jul 6, 2024 | Quantization, Time Series | Unverified 0
Quantizing YOLOv7: A Comprehensive Study | Jul 6, 2024 | Model Compression, object-detection | Unverified 0
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression | Jul 6, 2024 | Language Modeling, Language Modelling | Code Available 0
Balance of Number of Embedding and their Dimensions in Vector Quantization | Jul 6, 2024 | Quantization | Unverified 0
ZOBNN: Zero-Overhead Dependable Design of Binary Neural Networks with Deliberately Quantized Parameters | Jul 6, 2024 | Attribute, Quantization | Unverified 0
Hybrid Receiver Design for Massive MIMO-OFDM with Low-Resolution ADCs and Oversampling | Jul 5, 2024 | Quantization | Unverified 0
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking | Jul 5, 2024 | Language Modelling, Large Language Model | Code Available 1
Resource-Efficient Speech Quality Prediction through Quantization Aware Training and Binary Activation Maps | Jul 5, 2024 | Quantization | Code Available 0
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models | Jul 5, 2024 | Deep Reinforcement Learning, Model Compression | Unverified 0
Joint Beamforming Design and Bit Allocation in Massive MIMO with Resolution-Adaptive ADCs | Jul 4, 2024 | Quantization | Unverified 0
Low-latency machine learning FPGA accelerator for multi-qubit-state discrimination | Jul 4, 2024 | Quantization | Unverified 0