SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Jan 30, 2025 | Image Generation, Model Compression | Code available: 9
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers | Oct 31, 2022 | GPU, Language Modelling | Code available: 7
A Survey on Knowledge Distillation of Large Language Models | Feb 20, 2024 | Data Augmentation, Knowledge Distillation | Code available: 5
LLM Inference Unveiled: Survey and Roofline Model Insights | Feb 26, 2024 | Knowledge Distillation, Language Modelling | Code available: 4
Efficient Reasoning Models: A Survey | Apr 15, 2025 | Knowledge Distillation, Model Compression | Code available: 3
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression | Mar 16, 2025 | Language Modeling, Language Modelling | Code available: 3
ZipNN: Lossless Compression for AI Models | Nov 7, 2024 | Model Compression | Code available: 3
ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models | Aug 16, 2024 | GPU, Model Compression | Code available: 3
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Aug 7, 2024 | 3DGS, Model Compression | Code available: 3
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression | Mar 12, 2024 | Language Modeling, Language Modelling | Code available: 3
LightGNN: Simple Graph Neural Network for Recommendation | Jan 6, 2025 | Computational Efficiency, Graph Neural Network | Code available: 2
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models | Sep 26, 2024 | Large Language Model, Model Compression | Code available: 2
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression | Code available: 2
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code available: 2
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design | May 2, 2024 | Model Compression, Neural Network Compression | Code available: 2
PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning | Feb 27, 2024 | Knowledge Distillation, Model Compression | Code available: 2
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Feb 6, 2024 | Image Generation, Model Compression | Code available: 2
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection, Autonomous Vehicles | Code available: 2
Compact 3D Gaussian Representation for Radiance Field | Nov 22, 2023 | 3DGS, Model Compression | Code available: 2
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models | Aug 25, 2023 | Common Sense Reasoning, Computational Efficiency | Code available: 2
Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey | Aug 18, 2023 | Deblurring, Image Restoration | Code available: 2
Compressing Volumetric Radiance Fields to 1 MB | Nov 29, 2022 | Model Compression, NeRF | Code available: 2
On-Device Domain Generalization | Sep 15, 2022 | Data Augmentation, Domain Generalization | Code available: 2
Towards Lightweight Super-Resolution with Dual Regression Learning | Jul 16, 2022 | Image Super-Resolution, Model Compression | Code available: 2
Learning Student Networks in the Wild | Jun 19, 2021 | Knowledge Distillation, Model Compression | Code available: 2
Fast convolutional neural networks on FPGAs with hls4ml | Jan 13, 2021 | Model Compression, Quantization | Code available: 2
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | Apr 13, 2020 | Knowledge Distillation, Model Compression | Code available: 2
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models | Aug 23, 2019 | Knowledge Distillation, Language Modelling | Code available: 2
AMC: AutoML for Model Compression and Acceleration on Mobile Devices | Feb 10, 2018 | AutoML, GPU | Code available: 2
Data-Free Knowledge Distillation for Deep Neural Networks | Oct 19, 2017 | Data-free Knowledge Distillation, Knowledge Distillation | Code available: 2
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation | Jun 13, 2025 | Model Compression, Quantization | Code available: 1
Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting | May 15, 2025 | 3DGS, Model Compression | Code available: 1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Feb 16, 2025 | Code Generation, Instruction Following | Code available: 1
Forget the Data and Fine-Tuning! Just Fold the Network to Compress | Feb 14, 2025 | Model Compression | Code available: 1
DarwinLM: Evolutionary Structured Pruning of Large Language Models | Feb 11, 2025 | Model Compression | Code available: 1
Activation-Informed Merging of Large Language Models | Feb 4, 2025 | Computational Efficiency, Continual Learning | Code available: 1
A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion | Jan 13, 2025 | Dynamic neural networks, Model Compression | Code available: 1
Merging Feed-Forward Sublayers for Compressed Transformers | Jan 10, 2025 | image-classification, Image Classification | Code available: 1
CoA: Towards Real Image Dehazing via Compression-and-Adaptation | Jan 1, 2025 | Image Dehazing, Model Compression | Code available: 1
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN | Dec 18, 2024 | Model Compression | Code available: 1
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment | Oct 28, 2024 | Benchmarking, Language Modeling | Code available: 1
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Oct 18, 2024 | Model Compression, Quantization | Code available: 1
SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression | Oct 12, 2024 | Model Compression, Natural Language Understanding | Code available: 1
QT-DoG: Quantization-aware Training for Domain Generalization | Oct 8, 2024 | Domain Generalization, Model Compression | Code available: 1
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression | Oct 2, 2024 | Language Modeling, Language Modelling | Code available: 1
Search for Efficient Large Language Models | Sep 25, 2024 | GPU, Model Compression | Code available: 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Sep 3, 2024 | Knowledge Distillation, Model Compression | Code available: 1
Hyper-Compression: Model Compression via Hyperfunction | Sep 1, 2024 | model, Model Compression | Code available: 1
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic | Aug 24, 2024 | Model Compression, Task Arithmetic | Code available: 1
Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers | Aug 22, 2024 | Model Compression | Code available: 1