Efficient Post-training Quantization with FP8 Formats Sep 26, 2023 Image Classification
Code Available 45 Polysemous codes Sep 7, 2016 Quantization
Code Available 45 Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs Sep 11, 2023 Quantization
Code Available 45 UniTok: A Unified Tokenizer for Visual Generation and Understanding Feb 27, 2025 Quantization
Code Available 45 Large Language Models for Time Series: A Survey Feb 2, 2024 Quantization Survey
Code Available 45 T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge Jun 25, 2024 Computational Efficiency CPU
Code Available 45 The case for 4-bit precision: k-bit Inference Scaling Laws Dec 19, 2022 Quantization
Code Available 45 Link and code: Fast indexing with graphs and compact regression codes Apr 26, 2018 Image Similarity Search Quantization
Code Available 45 The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Mar 14, 2022 CPU Quantization
Code Available 45 SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Nov 7, 2024 GPU Quantization
Code Available 45 BitNet a4.8: 4-bit Activations for 1-bit LLMs Nov 7, 2024 Quantization
Code Available 45 SNAC: Multi-Scale Neural Audio Codec Oct 18, 2024 Audio Compression Audio Generation
Code Available 45 BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation Feb 16, 2024 Knowledge Distillation Quantization
Code Available 45 LLM Inference Unveiled: Survey and Roofline Model Insights Feb 26, 2024 Knowledge Distillation Language Modelling
Code Available 45 Taming Scalable Visual Tokenizer for Autoregressive Image Generation Dec 3, 2024 Image Generation Image Reconstruction
Code Available 45 VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models Sep 25, 2024 Quantization
Code Available 45 BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec Sep 9, 2024 Quantization
Code Available 35 HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression Mar 21, 2024 3DGS Attribute
Code Available 35 Scaling Transformers for Low-Bitrate High-Quality Speech Coding Nov 29, 2024 Quantization
Code Available 35 RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation Jan 9, 2024 GPU Math
Code Available 35 HAC++: Towards 100X Compression of 3D Gaussian Splatting Jan 21, 2025 3DGS Attribute
Code Available 35 FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Jan 25, 2024 GPU Quantization
Code Available 35 Behavior Generation with Latent Actions Mar 5, 2024 Autonomous Driving Decision Making
Code Available 35 GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting Mar 13, 2024 GPU Quantization
Code Available 35 Fast Matrix Multiplications for Lookup Table-Quantized LLMs Jul 15, 2024 Quantization
Code Available 35 BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Feb 6, 2024 Binarization GPU
Code Available 35 FlatQuant: Flatness Matters for LLM Quantization Oct 12, 2024 Quantization
Code Available 35 Autoregressive Image Generation using Residual Quantization Mar 3, 2022 Conditional Image Generation Image Generation
Code Available 35 EfficientQAT: Efficient Quantization-Aware Training for Large Language Models Jul 10, 2024 GPU Quantization
Code Available 35 ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Feb 4, 2025 Quantization
Code Available 35 OneBit: Towards Extremely Low-bit Large Language Models Feb 17, 2024 Quantization
Code Available 35 DPLM-2: A Multimodal Diffusion Protein Language Model Oct 17, 2024 Language Modeling Language Modelling
Code Available 35 PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models Apr 3, 2024 GSM8K Quantization
Code Available 35 MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts Apr 22, 2024 Common Sense Reasoning GPU
Code Available 35 ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models Aug 16, 2024 GPU Model Compression
Code Available 35 MotionGPT: Human Motion as a Foreign Language Jun 26, 2023 Language Modeling Language Modelling
Code Available 35 CV-VAE: A Compatible Video VAE for Latent Generative Video Models May 30, 2024 Quantization
Code Available 35 Ditto: Quantization-aware Secure Inference of Transformers upon MPC May 9, 2024 Quantization
Code Available 35 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Mar 5, 2024 Quantization Speech Synthesis
Code Available 35 A Survey on Large Language Model Acceleration based on KV Cache Management Dec 27, 2024 Language Modeling Language Modelling
Code Available 35 Addressing Representation Collapse in Vector Quantized Models with One Linear Layer Nov 4, 2024 Quantization Representation Learning
Code Available 35 A Survey on Inference Optimization Techniques for Mixture of Experts Models Dec 18, 2024 Computational Efficiency Distributed Computing
Code Available 35 Data Generation for Hardware-Friendly Post-Training Quantization Oct 29, 2024 Data Augmentation GPU
Code Available 35 LLM-QAT: Data-Free Quantization Aware Training for Large Language Models May 29, 2023 Data Free Quantization Quantization
Code Available 35 MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization Jan 2, 2025 Contrastive Learning Key Detection
Code Available 35 Compact 3D Scene Representation via Self-Organizing Gaussian Grids Dec 19, 2023 3DGS
Code Available 35 Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields Aug 7, 2024 3DGS Model Compression
Code Available 35 Latent Action Pretraining from Videos Oct 15, 2024 Quantization Robot Manipulation
Code Available 35 Language-Codec: Bridging Discrete Codec Representations and Speech Language Models Feb 19, 2024 Audio Compression Audio Generation
Code Available 35 Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model Aug 30, 2024 Audio Compression Audio Generation
Code Available 35