Visual Autoregressive Modeling for Image Super-Resolution Jan 31, 2025 Image Super-Resolution Quantization
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting Jan 26, 2025 Quantization
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting Jan 23, 2025 Language Modeling Language Modelling
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search Jan 16, 2025 Quantization
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks Jan 6, 2025 Decoder Quantization
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies Jan 4, 2025 Edge-computing Knowledge Distillation
MBQ: Modality-Balanced Quantization for Large Vision-Language Models Dec 27, 2024 GPU Quantization
Preventing Local Pitfalls in Vector Quantization via Optimal Transport Dec 19, 2024 Image Reconstruction Quantization
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos Dec 5, 2024 Attribute Quantization
MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension Nov 26, 2024 Language Modeling Language Modelling
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution Nov 26, 2024 Denoising Image Super-Resolution
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency Nov 25, 2024 Quantization Video Restoration
Quantized symbolic time series approximation Nov 20, 2024 Anomaly Detection Astronomy
SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search Nov 19, 2024 Quantization Re-Ranking
The Super Weight in Large Language Models Nov 11, 2024 Language Modeling Language Modelling
Scaling Laws for Precision Nov 7, 2024 Quantization
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks Oct 28, 2024 Quantization
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search Oct 24, 2024 Clustering GPU
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction Oct 17, 2024 Quantization
Quamba: A Post-Training Quantization Recipe for Selective State Space Models Oct 17, 2024 Computational Efficiency Mamba
When Attention Sink Emerges in Language Models: An Empirical View Oct 14, 2024 Quantization
Q-VLM: Post-training Quantization for Large Vision-Language Models Oct 10, 2024 Language Modeling Language Modelling
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More Oct 8, 2024 Mixture-of-Experts Quantization
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization Oct 7, 2024 Common Sense Reasoning Quantization
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Oct 2, 2024 Image Generation Quantization
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization Sep 25, 2024 GPU Quantization
Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search Sep 16, 2024 Quantization
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training Sep 13, 2024 Quantization
Training-Free Activation Sparsity in Large Language Models Aug 26, 2024 Quantization
MobileQuant: Mobile-friendly Quantization for On-device Language Models Aug 25, 2024 Quantization
Efficient Autoregressive Audio Modeling via Next-Scale Prediction Aug 16, 2024 Audio Generation FAD
Palu: Compressing KV-Cache with Low-Rank Projection Jul 30, 2024 GPU Quantization
Temporal Feature Matters: A Framework for Diffusion Model Quantization Jul 28, 2024 Denoising Image Generation
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale Jul 17, 2024 GPU LAMBADA
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval Jul 17, 2024 Decoder Image Enhancement
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Jul 1, 2024 Book summarization Quantization
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers Jun 25, 2024 Image Generation Model Compression
EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting Jun 22, 2024 Language Modeling Language Modelling
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Jun 17, 2024 image-classification Image Classification
QQQ: Quality Quattuor-Bit Quantization for Large Language Models Jun 14, 2024 Quantization
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models Jun 13, 2024 Math Quantization
Low-Rank Quantization-Aware Training for LLMs Jun 10, 2024 GPU parameter-efficient fine-tuning
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs Jun 3, 2024 Management Quantization
Compressing Large Language Models using Low Rank and Low Precision Decomposition May 29, 2024 Quantization
LoQT: Low-Rank Adapters for Quantized Pretraining May 26, 2024 GPU Language Modeling
TerDiT: Ternary Diffusion Models with Transformers May 23, 2024 Image Generation Quantization
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models May 23, 2024 Natural Language Understanding Quantization
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search May 21, 2024 Quantization
Imp: Highly Capable Large Multimodal Models for Mobile Devices May 20, 2024 Quantization Visual Question Answering
PTQ4SAM: Post-Training Quantization for Segment Anything May 6, 2024 Instance Segmentation object-detection