PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models Apr 3, 2024 GSM8K Quantization
The Unreasonable Ineffectiveness of the Deeper Layers Mar 26, 2024 GPU Quantization
HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression Mar 21, 2024 3DGS Attribute
GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting Mar 13, 2024 GPU Quantization
Behavior Generation with Latent Actions Mar 5, 2024 Autonomous Driving Decision Making
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Mar 5, 2024 Quantization Speech Synthesis
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact Mar 2, 2024 Language Modeling Language Modelling
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models Feb 19, 2024 Audio Compression Audio Generation
OneBit: Towards Extremely Low-bit Large Language Models Feb 17, 2024 Quantization
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Feb 6, 2024 Binarization GPU
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Feb 5, 2024 Quantization
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization Jan 31, 2024 GPU Quantization
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Jan 25, 2024 GPU Quantization
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models Jan 16, 2024 GPU Quantization
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation Jan 9, 2024 GPU Math
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones Dec 28, 2023 Computational Efficiency Image Captioning
Compact 3D Scene Representation via Self-Organizing Gaussian Grids Dec 19, 2023 3DGS
MotionGPT: Human Motion as a Foreign Language Jun 26, 2023 Language Modeling Language Modelling
High-Fidelity Audio Compression with Improved RVQGAN Jun 11, 2023 Audio Compression Audio Generation
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models May 29, 2023 Data Free Quantization Quantization
Autoregressive Image Generation using Residual Quantization Mar 3, 2022 Conditional Image Generation Image Generation
8-bit Optimizers via Block-wise Quantization Oct 6, 2021 Language Modeling Language Modelling
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Jun 20, 2020 Quantization Self-Supervised Learning
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization Jul 14, 2025 2k Image Generation
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization Jul 10, 2025 2k Quantization
any4: Learned 4-bit Numeric Representation for LLMs Jul 7, 2025 GPU GSM8K
AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing Jun 23, 2025 Neural Architecture Search Quantization
From Tiny Machine Learning to Tiny Deep Learning: A Survey Jun 21, 2025 AutoML Model Optimization
DETRPose: Real-time end-to-end transformer model for multi-person pose estimation Jun 16, 2025 2D Pose Estimation Decoder
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation Jun 9, 2025 Quantization Vision-Language-Action
RecGPT: A Foundation Model for Sequential Recommendation Jun 6, 2025 Decoder model
Model-Preserving Adaptive Rounding May 29, 2025 model Quantization
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching May 26, 2025 Quantization Speech Enhancement
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space May 19, 2025 Language Modeling Language Modelling
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance May 11, 2025 Language Modeling Language Modelling
Diffusion Model Quantization: A Review May 8, 2025 model Quantization
An Empirical Study of Qwen3 Quantization May 4, 2025 Natural Language Understanding Quantization
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Apr 29, 2025 Quantization
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models Apr 15, 2025 Autonomous Driving Computational Efficiency
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models Apr 7, 2025 Math Quantization
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration Apr 3, 2025 GPU Quantization
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models Mar 28, 2025 MMLU Quantization
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation Mar 27, 2025 Image Generation Quantization
GENIUS: A Generative Framework for Universal Multimodal Search Mar 25, 2025 Information Retrieval Quantization
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache Mar 24, 2025 Computational Efficiency GPU
D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS Mar 7, 2025 Denoising Quantization
BHViT: Binarized Hybrid Vision Transformer Mar 4, 2025 Binarization Quantization
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models Feb 19, 2025 GPU Quantization
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Feb 7, 2025 GPU Quantization
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding Feb 3, 2025 Quantization