Preventing Local Pitfalls in Vector Quantization via Optimal Transport Dec 19, 2024 Image Reconstruction Quantization
Palu: Compressing KV-Cache with Low-Rank Projection Jul 30, 2024 GPU Quantization
PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution Nov 26, 2024 Denoising Image Super-Resolution
Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search Sep 16, 2024 Quantization
On-Device Training Under 256KB Memory Jun 30, 2022 Lifelong learning Quantization
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks Oct 28, 2024 Quantization
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models Aug 25, 2023 Common Sense Reasoning Computational Efficiency
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies Jan 4, 2025 Edge-computing Knowledge Distillation
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers Sep 28, 2023 GPU Instruction Following
MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension Nov 26, 2024 Language Modeling Language Modelling
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS Mar 7, 2025 Denoising Quantization
Neural Network Compression Framework for fast model inference Feb 20, 2020 Binarization CPU
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting Jan 23, 2025 Language Modeling Language Modelling
QuIP: 2-Bit Quantization of Large Language Models With Guarantees Jul 25, 2023 Quantization
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization Jul 14, 2025 2k Image Generation
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More Oct 8, 2024 Mixture-of-Experts Quantization
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization Jul 10, 2025 2k Quantization
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding Feb 3, 2025 Quantization
Low-Rank Quantization-Aware Training for LLMs Jun 10, 2024 GPU parameter-efficient fine-tuning
MAexp: A Generic Platform for RL-based Multi-Agent Exploration Apr 19, 2024 Diversity Multi-agent Reinforcement Learning
MAUVE Scores for Generative Models: Theory and Practice Dec 30, 2022 Quantization
LoQT: Low-Rank Adapters for Quantized Pretraining May 26, 2024 GPU Language Modeling
Compressing Volumetric Radiance Fields to 1 MB Nov 29, 2022 Model Compression NeRF
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search Oct 24, 2024 Clustering GPU
LLM-FP4: 4-Bit Floating-Point Quantized Transformers Oct 25, 2023 Common Sense Reasoning Quantization
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection Jan 29, 2024 3D Object Detection Autonomous Vehicles
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving Oct 29, 2023 GPU Quantization
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS Nov 28, 2023 Knowledge Distillation NeRF
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search Jan 16, 2025 Quantization
MobileQuant: Mobile-friendly Quantization for On-device Language Models Aug 25, 2024 Quantization
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Oct 12, 2023 Natural Language Understanding Quantization
Compressing Large Language Models using Low Rank and Low Precision Decomposition May 29, 2024 Quantization
Compact 3D Gaussian Representation for Radiance Field Nov 22, 2023 3DGS Model Compression
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Oct 2, 2024 Image Generation Quantization
CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization Nov 30, 2023 3DGS NeRF
Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation Nov 15, 2023 Quantization Recommendation Systems
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Jul 1, 2024 Book summarization Quantization
MBQ: Modality-Balanced Quantization for Large Vision-Language Models Dec 27, 2024 GPU Quantization
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training May 31, 2023 Language Modelling Quantization
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization Mar 19, 2024 Quantization
LeanVec: Searching vectors faster by making them fit Dec 26, 2023 Cross-Modal Retrieval Dimensionality Reduction
any4: Learned 4-bit Numeric Representation for LLMs Jul 7, 2025 GPU GSM8K
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution Apr 4, 2024 Image Super-Resolution Quantization
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs Feb 16, 2024 Quantization
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization Sep 25, 2024 GPU Quantization
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference Jul 4, 2022 Quantization
Imp: Highly Capable Large Multimodal Models for Mobile Devices May 20, 2024 Quantization Visual Question Answering
Model-Preserving Adaptive Rounding May 29, 2025 model Quantization
An Empirical Study of Qwen3 Quantization May 4, 2025 Natural Language Understanding Quantization
HAQ: Hardware-Aware Automated Quantization with Mixed Precision Nov 21, 2018 Quantization Reinforcement Learning