MAUVE Scores for Generative Models: Theory and Practice Dec 30, 2022 Quantization
Compressing Volumetric Radiance Fields to 1 MB Nov 29, 2022 Model Compression NeRF
A Closer Look at Hardware-Friendly Weight Quantization Oct 7, 2022 Quantization
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference Jul 4, 2022 Quantization
On-Device Training Under 256KB Memory Jun 30, 2022 Lifelong learning Quantization
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers Jun 4, 2022 Knowledge Distillation Quantization
Re-parameterizing Your Optimizers rather than Architectures May 30, 2022 Quantization
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder May 13, 2022 Blind Face Restoration Decoder
BMInf: An Efficient Toolkit for Big Model Inference and Tuning May 1, 2022 CPU GPU
4-bit Conformer with Native Quantization Aware Training for Speech Recognition Mar 29, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization Mar 11, 2022 image-classification Image Classification
QuantumNAT: Quantum Noise-Aware Training with Noise Injection, Quantization and Normalization Oct 21, 2021 Denoising Quantization
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices Mar 9, 2021 BIG-bench Machine Learning Diagnostic
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference Jan 13, 2021 Code Generation Deep Learning
Fast convolutional neural networks on FPGAs with hls4ml Jan 13, 2021 Model Compression Quantization
I-BERT: Integer-only BERT Quantization Jan 5, 2021 GPU Natural Language Inference
Binary Neural Networks: A Survey Mar 31, 2020 Binarization image-classification
Neural Network Compression Framework for fast model inference Feb 20, 2020 Binarization CPU
HAQ: Hardware-Aware Automated Quantization with Mixed Precision Nov 21, 2018 Quantization Reinforcement Learning
Bolt: Accelerated Data Mining with Fast Vector Compression Jun 30, 2017 Quantization
Compress Any Segment Anything Model (SAM) Jul 11, 2025 model Quantization
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation Jun 29, 2025 Image Generation Image-to-Image Translation
Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models Jun 25, 2025 Quantization
CommVQ: Commutative Vector Quantization for KV Cache Compression Jun 23, 2025 GPU GSM8K
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation Jun 13, 2025 Model Compression Quantization
Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design May 28, 2025 GPU Quantization
TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization May 26, 2025 CPU GPU
Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression May 26, 2025 Language Modeling Language Modelling
FP4 All the Way: Fully Quantized Training of LLMs May 25, 2025 All Quantization
Mind the Gap: A Practical Attack on GGUF Quantization May 24, 2025 Code Generation Quantization
PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs May 24, 2025 Quantization
DVD-Quant: Data-free Video Diffusion Transformers Quantization May 24, 2025 Data Free Quantization Quantization
UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information May 23, 2025 Large Language Model Quantization
Optimizing Binary and Ternary Neural Network Inference on RRAM Crossbars using CIM-Explorer May 20, 2025 Quantization
Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis May 20, 2025 GPU parameter-efficient fine-tuning
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization May 19, 2025 GPU Quantization
GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models May 16, 2025 Adversarial Attack Adversarial Defense
Accurate KV Cache Quantization with Outlier Tokens Tracing May 16, 2025 Quantization
EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes May 16, 2025 3DGS NeRF
Analog Foundation Models May 14, 2025 4k Quantization
Continuous Visual Autoregressive Generation via Score Maximization May 12, 2025 Quantization
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design May 9, 2025 Mixture-of-Experts Quantization
RGB-Event Fusion with Self-Attention for Collision Prediction May 7, 2025 Benchmarking Computational Efficiency
Fast and Low-Cost Genomic Foundation Models via Outlier Removal May 1, 2025 Adversarial Attack Adversarial Robustness
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models Apr 20, 2025 Quantization
Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection Apr 17, 2025 Link Prediction Node Classification
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression Apr 10, 2025 Math MMLU
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers Apr 3, 2025 Quantization
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators Apr 3, 2025 Mixture-of-Experts Quantization
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Apr 1, 2025 Image Generation Image Reconstruction