LLM-FP4: 4-Bit Floating-Point Quantized Transformers Oct 25, 2023 Common Sense Reasoning Quantization
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS Mar 7, 2025 Denoising Quantization
Low-Rank Quantization-Aware Training for LLMs Jun 10, 2024 GPU parameter-efficient fine-tuning
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution Apr 4, 2024 Image Super-Resolution Quantization
LeanVec: Searching vectors faster by making them fit Dec 26, 2023 Cross-Modal Retrieval Dimensionality Reduction
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection Jan 29, 2024 3D Object Detection Autonomous Vehicles
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches Jul 1, 2024 Book summarization Quantization
Binarized Neural Machine Translation Feb 9, 2023 Binarization Machine Translation
QQQ: Quality Quattuor-Bit Quantization for Large Language Models Jun 14, 2024 Quantization
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation Oct 2, 2024 Image Generation Quantization
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization Mar 19, 2024 Quantization
Quantized symbolic time series approximation Nov 20, 2024 Anomaly Detection Astronomy
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos Dec 5, 2024 Attribute Quantization
Binary Neural Networks: A Survey Mar 31, 2020 Binarization image-classification
Dataset Quantization Aug 21, 2023 Dataset Distillation object-detection
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS Nov 28, 2023 Knowledge Distillation NeRF
MAexp: A Generic Platform for RL-based Multi-Agent Exploration Apr 19, 2024 Diversity Multi-agent Reinforcement Learning
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models Aug 25, 2023 Common Sense Reasoning Computational Efficiency
BitNet: Scaling 1-bit Transformers for Large Language Models Oct 17, 2023 Language Modeling Language Modelling
Imp: Highly Capable Large Multimodal Models for Mobile Devices May 20, 2024 Quantization Visual Question Answering
Q-VLM: Post-training Quantization for Large Vision-Language Models Oct 10, 2024 Language Modeling Language Modelling
any4: Learned 4-bit Numeric Representation for LLMs Jul 7, 2025 GPU GSM8K
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs Feb 16, 2024 Quantization
An empirical study of LLaMA3 quantization: from LLMs to MLLMs Apr 22, 2024 Language Modelling Large Language Model
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices Mar 9, 2021 BIG-bench Machine Learning Diagnostic
I-BERT: Integer-only BERT Quantization Jan 5, 2021 GPU Natural Language Inference
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization Sep 25, 2024 GPU Quantization
Bolt: Accelerated Data Mining with Fast Vector Compression Jun 30, 2017 Quantization
HAQ: Hardware-Aware Automated Quantization with Mixed Precision Nov 21, 2018 Quantization Reinforcement Learning
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% Jun 17, 2024 image-classification Image Classification
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation Mar 27, 2025 Image Generation Quantization
An Empirical Study of Qwen3 Quantization May 4, 2025 Natural Language Understanding Quantization
Similarity search in the blink of an eye with compressed indices Apr 7, 2023 Quantization
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction Oct 17, 2024 Quantization
Compressing Volumetric Radiance Fields to 1 MB Nov 29, 2022 Model Compression NeRF
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration Apr 3, 2025 GPU Quantization
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval Jul 17, 2024 Decoder Image Enhancement
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models Aug 31, 2023 Decoder Language Modeling
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance May 11, 2025 Language Modeling Language Modelling
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference Jul 4, 2022 Quantization
From Tiny Machine Learning to Tiny Deep Learning: A Survey Jun 21, 2025 AutoML Model Optimization
Efficient Multi-Vector Dense Retrieval Using Bit Vectors Apr 3, 2024 Quantization Retrieval
AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing Jun 23, 2025 Neural Architecture Search Quantization
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting Jan 26, 2025 Quantization
CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization Nov 30, 2023 3DGS NeRF
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space May 19, 2025 Language Modeling Language Modelling
Compact 3D Gaussian Representation for Radiance Field Nov 22, 2023 3DGS Model Compression
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching May 26, 2025 Quantization Speech Enhancement
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM Mar 8, 2024 Quantization
Fast convolutional neural networks on FPGAs with hls4ml Jan 13, 2021 Model Compression Quantization