AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model; Mar 5, 2025; Instance Segmentation, Quantization; Unverified; 0
Sparse Meets Dense: Unified Generative Recommendations with Cascaded Sparse-Dense Representations; Mar 4, 2025; Quantization, Recommendation Systems; Unverified; 0
BHViT: Binarized Hybrid Vision Transformer; Mar 4, 2025; Binarization, Quantization; Code Available; 2
Q&C: When Quantization Meets Cache in Efficient Image Generation; Mar 4, 2025; Image Generation, Quantization; Code Available; 0
BdSLW401: Transformer-Based Word-Level Bangla Sign Language Recognition Using Relative Quantization Encoding (RQE); Mar 4, 2025; Quantization, Sign Language Recognition; Unverified; 0
DILEMMA: Joint LLM Quantization and Distributed LLM Inference Over Edge Computing Systems; Mar 3, 2025; Edge-computing, Knowledge Distillation; Unverified; 0
Regularization-based Framework for Quantization-, Fault- and Variability-Aware Training; Mar 3, 2025; Quantization; Unverified; 0
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models; Mar 3, 2025; Mixture-of-Experts, Quantization; Unverified; 0
Cauchy-Schwarz Regularizers; Mar 3, 2025; Quantization; Code Available; 0
KurTail: Kurtosis-based LLM Quantization; Mar 3, 2025; GPU, Language Modeling; Unverified; 0
Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text; Mar 3, 2025; Image Generation, Quantization; Unverified; 0
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs; Mar 3, 2025; Quantization; Code Available; 1
Patient-Level Anatomy Meets Scanning-Level Physics: Personalized Federated Low-Dose CT Denoising Empowered by Large Language Model; Mar 2, 2025; Anatomy, Denoising; Code Available; 0
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations; Mar 2, 2025; image-classification, Image Classification; Unverified; 0
Towards Lossless Implicit Neural Representation via Bit Plane Decomposition; Feb 28, 2025; Image Compression, Quantization; Code Available; 1
Strong Solutions and Quantization-Based Numerical Schemes for a Class of Non-Markovian Volatility Models; Feb 28, 2025; Quantization; Unverified; 0
Oscillation-Reduced MXFP4 Training for Vision Transformers; Feb 28, 2025; GPU, Quantization; Code Available; 1
UniTok: A Unified Tokenizer for Visual Generation and Understanding; Feb 27, 2025; Quantization; Code Available; 4
Speculative Decoding and Beyond: An In-Depth Review of Techniques; Feb 27, 2025; Quantization; Unverified; 0
Transformer-Based Nonlinear Transform Coding for Multi-Rate CSI Compression in MIMO-OFDM Systems; Feb 27, 2025; Image Compression, Quantization; Unverified; 0
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models; Feb 27, 2025; Knowledge Distillation, Model Compression; Unverified; 0
HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration; Feb 27, 2025; Quantization; Unverified; 0
On the Privacy-Preserving Properties of Spiking Neural Networks with Unique Surrogate Gradients and Quantization Levels; Feb 25, 2025; Privacy Preserving, Quantization; Unverified; 0
Memory-Free and Parallel Computation for Quantized Spiking Neural Networks; Feb 25, 2025; Computational Efficiency, Quantization; Unverified; 0
Compressing Language Models for Specialized Domains; Feb 25, 2025; Quantization; Unverified; 0
Task-Driven Semantic Quantization and Imitation Learning for Goal-Oriented Communications; Feb 25, 2025; Imitation Learning, Quantization; Unverified; 0
Unbiased and Sign Compression in Distributed Learning: Comparing Noise Resilience via SDEs; Feb 24, 2025; Distributed Optimization, Language Modeling; Unverified; 0
Compression Scaling Laws: Unifying Sparsity and Quantization; Feb 23, 2025; Quantization; Unverified; 0
Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification; Feb 23, 2025; Classification, Inference Optimization; Unverified; 0
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration; Feb 23, 2025; 3DGS, 3D Semantic Segmentation; Unverified; 0
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression; Feb 23, 2025; Efficient Neural Network, Quantization; Code Available; 1
A 2-bit Wideband 5G mm-Wave RIS with Low Side Lobe Levels and no Quantization Lobe; Feb 22, 2025; Quantization; Unverified; 0
Verification of Bit-Flip Attacks against Quantized Neural Networks; Feb 22, 2025; Neural Network Security, Quantization; Unverified; 0
Speech Enhancement Using Continuous Embeddings of Neural Audio Codec; Feb 22, 2025; Quantization, Speech Enhancement; Unverified; 0
Q-PETR: Quant-aware Position Embedding Transformation for Multi-View 3D Object Detection; Feb 21, 2025; 3D Object Detection, Autonomous Driving; Unverified; 0
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models; Feb 21, 2025; Model Compression, Quantization; Unverified; 0
CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution; Feb 21, 2025; Image Super-Resolution, Quantization; Code Available; 1
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention; Feb 21, 2025; Quantization; Unverified; 0
Exact Recovery of Sparse Binary Vectors from Generalized Linear Measurements; Feb 21, 2025; 2k, Quantization; Unverified; 0
FD-LSCIC: Frequency Decomposition-based Learned Screen Content Image Compression; Feb 21, 2025; Image Compression, MS-SSIM; Unverified; 0
Interleaved Block-based Learned Image Compression with Feature Enhancement and Quantization Error Compensation; Feb 21, 2025; Image Compression, MS-SSIM; Unverified; 0
Hardware-Friendly Static Quantization Method for Video Diffusion Transformers; Feb 20, 2025; Quantization, Video Generation; Unverified; 0
More for Keys, Less for Values: Adaptive KV Cache Quantization; Feb 20, 2025; Quantization; Unverified; 0
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs; Feb 20, 2025; Quantization; Code Available; 3
Efficient AI in Practice: Training and Deployment of Efficient LLMs for Industry Applications; Feb 20, 2025; Knowledge Distillation, Model Compression; Unverified; 0
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies; Feb 19, 2025; Quantization; Unverified; 0
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models; Feb 19, 2025; GPU, Quantization; Code Available; 2
Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models; Feb 18, 2025; Quantization; Code Available; 0
A^2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization; Feb 18, 2025; CPU, Position; Unverified; 0
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis; Feb 18, 2025; Benchmarking, Mamba; Code Available; 0