decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points | Apr 19, 2024 | Quantization | Code Available 25
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation | Oct 2, 2024 | Image Generation Quantization | Code Available 25
Low-Rank Quantization-Aware Training for LLMs | Jun 10, 2024 | GPU parameter-efficient fine-tuning | Code Available 25
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS | Mar 7, 2025 | Denoising Quantization | Code Available 25
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution | Apr 4, 2024 | Image Super-Resolution Quantization | Code Available 25
QAQ: Quality Adaptive Quantization for LLM KV Cache | Mar 7, 2024 | Quantization Question Answering | Code Available 25
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks | Jan 6, 2025 | Decoder Quantization | Code Available 25
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS | Nov 28, 2023 | Knowledge Distillation NeRF | Code Available 25
LeanVec: Searching vectors faster by making them fit | Dec 26, 2023 | Cross-Modal Retrieval Dimensionality Reduction | Code Available 25
Binary Neural Networks: A Survey | Mar 31, 2020 | Binarization image-classification | Code Available 25
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection Autonomous Vehicles | Code Available 25
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models | Apr 7, 2025 | Math Quantization | Code Available 25
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization | Mar 19, 2024 | Quantization | Code Available 25
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains | Feb 15, 2024 | Few-Shot Learning Medical Question Answering | Code Available 25
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches | Jul 1, 2024 | Book summarization Quantization | Code Available 25
LLM-FP4: 4-Bit Floating-Point Quantized Transformers | Oct 25, 2023 | Common Sense Reasoning Quantization | Code Available 25
BitNet: Scaling 1-bit Transformers for Large Language Models | Oct 17, 2023 | Language Modeling Language Modelling | Code Available 25
MAexp: A Generic Platform for RL-based Multi-Agent Exploration | Apr 19, 2024 | Diversity Multi-agent Reinforcement Learning | Code Available 25
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation | Jun 9, 2025 | Quantization Vision-Language-Action | Code Available 25
On-Device Training Under 256KB Memory | Jun 30, 2022 | Lifelong learning Quantization | Code Available 25
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs | Feb 16, 2024 | Quantization | Code Available 25
any4: Learned 4-bit Numeric Representation for LLMs | Jul 7, 2025 | GPU GSM8K | Code Available 25
I-BERT: Integer-only BERT Quantization | Jan 5, 2021 | GPU Natural Language Inference | Code Available 25
Imp: Highly Capable Large Multimodal Models for Mobile Devices | May 20, 2024 | Quantization Visual Question Answering | Code Available 25
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization | Sep 25, 2024 | GPU Quantization | Code Available 25
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation | Mar 27, 2025 | Image Generation Quantization | Code Available 25
HAQ: Hardware-Aware Automated Quantization with Mixed Precision | Nov 21, 2018 | Quantization Reinforcement Learning | Code Available 25
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices | Mar 9, 2021 | BIG-bench Machine Learning Diagnostic | Code Available 25
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration | Apr 3, 2025 | GPU Quantization | Code Available 25
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Jun 17, 2024 | image-classification Image Classification | Code Available 25
An Empirical Study of Qwen3 Quantization | May 4, 2025 | Natural Language Understanding Quantization | Code Available 25
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance | May 11, 2025 | Language Modeling Language Modelling | Code Available 25
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect | Mar 6, 2024 | Quantization | Code Available 25
Similarity search in the blink of an eye with compressed indices | Apr 7, 2023 | Quantization | Code Available 25
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Jul 17, 2024 | Decoder Image Enhancement | Code Available 25
Efficient LLM Inference on CPUs | Nov 1, 2023 | Quantization | Code Available 25
An empirical study of LLaMA3 quantization: from LLMs to MLLMs | Apr 22, 2024 | Language Modelling Large Language Model | Code Available 25
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax | Apr 29, 2025 | Quantization | Code Available 25
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference | Jul 4, 2022 | Quantization | Code Available 25
Compressing Large Language Models using Low Rank and Low Precision Decomposition | May 29, 2024 | Quantization | Code Available 25
AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing | Jun 23, 2025 | Neural Architecture Search Quantization | Code Available 25
From Tiny Machine Learning to Tiny Deep Learning: A Survey | Jun 21, 2025 | AutoML Model Optimization | Code Available 25
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting | Jan 26, 2025 | Quantization | Code Available 25
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching | May 26, 2025 | Quantization Speech Enhancement | Code Available 25
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention | Feb 8, 2024 | MMLU Quantization | Code Available 25
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference | Jan 13, 2021 | Code Generation Deep Learning | Code Available 25
Compressing Volumetric Radiance Fields to 1 MB | Nov 29, 2022 | Model Compression NeRF | Code Available 25
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Mar 8, 2024 | Quantization | Code Available 25
CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Mar 14, 2024 | Classification Crowd Counting | Code Available 25
CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization | Nov 30, 2023 | 3DGS NeRF | Code Available 25