Binary Neural Networks: A Survey Mar 31, 2020 Binarization image-classification
Compressing Volumetric Radiance Fields to 1 MB Nov 29, 2022 Model Compression NeRF
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization Oct 7, 2024 Common Sense Reasoning Quantization
Preventing Local Pitfalls in Vector Quantization via Optimal Transport Dec 19, 2024 Image Reconstruction Quantization
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection Jan 29, 2024 3D Object Detection Autonomous Vehicles
BitNet: Scaling 1-bit Transformers for Large Language Models Oct 17, 2023 Language Modeling Language Modelling
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains Feb 15, 2024 Few-Shot Learning Medical Question Answering
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS Nov 28, 2023 Knowledge Distillation NeRF
Binarized Neural Machine Translation Feb 9, 2023 Binarization Machine Translation
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution Apr 4, 2024 Image Super-Resolution Quantization
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization Mar 19, 2024 Quantization
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Jul 1, 2024 Book summarization Quantization
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS Mar 7, 2025 Denoising Quantization
Quamba: A Post-Training Quantization Recipe for Selective State Space Models Oct 17, 2024 Computational Efficiency Mamba
AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval Apr 9, 2024 All Information Retrieval
LeanVec: Searching vectors faster by making them fit Dec 26, 2023 Cross-Modal Retrieval Dimensionality Reduction
Dataset Quantization Aug 21, 2023 Dataset Distillation object-detection
DETRPose: Real-time end-to-end transformer model for multi-person pose estimation Jun 16, 2025 2D Pose Estimation Decoder
BHViT: Binarized Hybrid Vision Transformer Mar 4, 2025 Binarization Quantization
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference Feb 15, 2024 GPU Quantization
Imp: Highly Capable Large Multimodal Models for Mobile Devices May 20, 2024 Quantization Visual Question Answering
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models Jun 13, 2024 Math Quantization
I-BERT: Integer-only BERT Quantization Jan 5, 2021 GPU Natural Language Inference
An empirical study of LLaMA3 quantization: from LLMs to MLLMs Apr 22, 2024 Language Modelling Large Language Model
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices Mar 9, 2021 BIG-bench Machine Learning Diagnostic
HAQ: Hardware-Aware Automated Quantization with Mixed Precision Nov 21, 2018 Quantization Reinforcement Learning
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving Oct 29, 2023 GPU Quantization
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation Mar 27, 2025 Image Generation Quantization
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance May 11, 2025 Language Modeling Language Modelling
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Jun 17, 2024 image-classification Image Classification
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration Apr 3, 2025 GPU Quantization
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization Sep 25, 2024 GPU Quantization
Similarity search in the blink of an eye with compressed indices Apr 7, 2023 Quantization
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction Oct 17, 2024 Quantization
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting Jan 26, 2025 Quantization
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM Mar 8, 2024 Quantization
From Tiny Machine Learning to Tiny Deep Learning: A Survey Jun 21, 2025 AutoML Model Optimization
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale Jul 17, 2024 GPU LAMBADA
GENIUS: A Generative Framework for Universal Multimodal Search Mar 25, 2025 Information Retrieval Quantization
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching May 26, 2025 Quantization Speech Enhancement
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference Jan 13, 2021 Code Generation Deep Learning
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Oct 2, 2024 Image Generation Quantization
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval Jul 17, 2024 Decoder Image Enhancement
Evaluating Quantized Large Language Models Feb 28, 2024 Mamba Quantization
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention Feb 8, 2024 MMLU Quantization
Efficient LLM Inference on CPUs Nov 1, 2023 Quantization
Fast convolutional neural networks on FPGAs with hls4ml Jan 13, 2021 Model Compression Quantization
SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search Nov 19, 2024 Quantization Re-Ranking
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency Nov 25, 2024 Quantization Video Restoration
An Empirical Study of Qwen3 Quantization May 4, 2025 Natural Language Understanding Quantization