| Paper | Date | Tasks | Status | Count |
|---|---|---|---|---|
| APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers | Apr 3, 2025 | Quantization | Code Available | 1 |
| GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration | Apr 3, 2025 | GPU, Quantization | Code Available | 2 |
| HPGN: Hybrid Priors-Guided Network for Compressed Low-Light Image Enhancement | Apr 3, 2025 | Image Enhancement, Low-Light Image Enhancement | Unverified | 0 |
| Moment Quantization for Video Temporal Grounding | Apr 3, 2025 | Quantization, Video Understanding | Unverified | 0 |
| LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi | Apr 2, 2025 | Computational Efficiency, Quantization | Unverified | 0 |
| When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks | Apr 2, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization | Apr 1, 2025 | Image Generation, Image Reconstruction | Code Available | 1 |
| QSViT: A Methodology for Quantizing Spiking Vision Transformers | Apr 1, 2025 | Quantization | Unverified | 0 |
| SQuat: Subspace-orthogonal KV Cache Quantization | Mar 31, 2025 | Quantization | Unverified | 0 |
| Model Hemorrhage and the Robustness Limits of Large Language Models | Mar 31, 2025 | Quantization | Unverified | 0 |
| Style Quantization for Data-Efficient GAN Training | Mar 31, 2025 | Navigate, Quantization | Unverified | 0 |
| Cocktail: Chunk-Adaptive Mixed-Precision Quantization for Long-Context LLM Inference | Mar 30, 2025 | GPU, Quantization | Unverified | 0 |
| NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations | Mar 29, 2025 | 3DGS, NeRF | Unverified | 0 |
| MCRB for Parameter Estimation from One-Bit Quantized and Oversampled Measurements | Mar 28, 2025 | Direction of Arrival Estimation, parameter estimation | Unverified | 0 |
| Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models | Mar 28, 2025 | MMLU, Quantization | Code Available | 2 |
| Make Some Noise: Towards LLM audio reasoning and generation using sound tokens | Mar 28, 2025 | Audio Generation, Quantization | Unverified | 0 |
| Long-Tail Crisis in Nearest Neighbor Language Models | Mar 28, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| A Refined Analysis of Massive Activations in LLMs | Mar 28, 2025 | Quantization | Code Available | 1 |
| Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration | Mar 27, 2025 | Computational Efficiency, Image Restoration | Unverified | 0 |
| Harmonizing Visual Representations for Unified Multimodal Understanding and Generation | Mar 27, 2025 | Image Generation, Quantization | Code Available | 2 |
| HOT: Hadamard-based Optimized Training | Mar 27, 2025 | Quantization | Code Available | 0 |
| MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness | Mar 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| VADMamba: Exploring State Space Models for Fast Video Anomaly Detection | Mar 27, 2025 | Anomaly Detection, Computational Efficiency | Code Available | 1 |
| A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network | Mar 27, 2025 | Quantization, speech-recognition | Unverified | 0 |
| MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation | Mar 26, 2025 | 3D Generation, Denoising | Unverified | 0 |
| QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition | Mar 25, 2025 | parameter-efficient fine-tuning, Quantization | Code Available | 0 |
| SINR: Sparsity Driven Compressed Implicit Neural Representations | Mar 25, 2025 | Quantization | Unverified | 0 |
| GENIUS: A Generative Framework for Universal Multimodal Search | Mar 25, 2025 | Information Retrieval, Quantization | Code Available | 2 |
| LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Mar 25, 2025 | Code Completion, Language Modeling | Code Available | 1 |
| QSID-MPC: Model Predictive Control with System Identification from Quantized Data | Mar 24, 2025 | Model Predictive Control, Quantization | Unverified | 0 |
| FFN Fusion: Rethinking Sequential Computation in Large Language Models | Mar 24, 2025 | Quantization | Unverified | 0 |
| GranQ: Granular Zero-Shot Quantization with Channel-Wise Activation Scaling in QAT | Mar 24, 2025 | Neural Network Compression, Quantization | Unverified | 0 |
| Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization | Mar 24, 2025 | GPU, Large Language Model | Unverified | 0 |
| BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Mar 24, 2025 | Computational Efficiency, GPU | Code Available | 2 |
| 4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video | Mar 24, 2025 | 3DGS, Quantization | Unverified | 0 |
| Energy-Aware LLMs: A step towards sustainable AI for downstream applications | Mar 22, 2025 | Quantization | Unverified | 0 |
| Improving Quantization with Post-Training Model Expansion | Mar 21, 2025 | Large Language Model, model | Unverified | 0 |
| Variance Control via Weight Rescaling in LLM Pre-training | Mar 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge | Mar 20, 2025 | Depth Estimation, Monocular Depth Estimation | Code Available | 1 |
| Neural Networks: According to the Principles of Grassmann Algebra | Mar 20, 2025 | Quantization | Unverified | 0 |
| Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models | Mar 20, 2025 | Quantization | Unverified | 0 |
| Learning Linear Block Codes with Gradient Quantization | Mar 20, 2025 | Decoder, Quantization | Unverified | 0 |
| SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs | Mar 20, 2025 | CPU, GPU | Unverified | 0 |
| Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction | Mar 20, 2025 | Image Generation, Language Modeling | Unverified | 0 |
| Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation | Mar 20, 2025 | Quantization | Unverified | 0 |
| LeanTTA: A Backpropagation-Free and Stateless Approach to Quantized Test-Time Adaptation on Edge Devices | Mar 20, 2025 | Quantization, Test-time Adaptation | Unverified | 0 |
| FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers | Mar 19, 2025 | Image Generation, Quantization | Code Available | 0 |
| PARQ: Piecewise-Affine Regularized Quantization | Mar 19, 2025 | Quantization | Unverified | 0 |
| RAG-based User Profiling for Precision Planning in Mixed-precision Over-the-Air Federated Learning | Mar 19, 2025 | Federated Learning, Quantization | Unverified | 0 |
| Natural Quantization of Neural Networks | Mar 19, 2025 | Quantization | Code Available | 0 |