SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Jan 30, 2025 | Image Generation, Model Compression
Code available | 95 | GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers | Oct 31, 2022 | GPU, Language Modelling
Code available | 75 | A Survey on Knowledge Distillation of Large Language Models | Feb 20, 2024 | Data Augmentation, Knowledge Distillation
Code available | 55 | LLM Inference Unveiled: Survey and Roofline Model Insights | Feb 26, 2024 | Knowledge Distillation, Language Modelling
Code available | 45 | ZipNN: Lossless Compression for AI Models | Nov 7, 2024 | Model Compression
Code available | 35 | SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression | Mar 12, 2024 | Language Modeling, Language Modelling
Code available | 35 | ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models | Aug 16, 2024 | GPU, Model Compression
Code available | 35 | SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression | Mar 16, 2025 | Language Modeling, Language Modelling
Code available | 35 | Efficient Reasoning Models: A Survey | Apr 15, 2025 | Knowledge Distillation, Model Compression
Code available | 35 | Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Aug 7, 2024 | 3DGS, Model Compression
Code available | 35 | Data-Free Knowledge Distillation for Deep Neural Networks | Oct 19, 2017 | Data-free Knowledge Distillation, Knowledge Distillation
Code available | 25 | QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Feb 6, 2024 | Image Generation, Model Compression
Code available | 25 | LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection, Autonomous Vehicles
Code available | 25 | Learning Student Networks in the Wild | Jun 19, 2021 | Knowledge Distillation, Model Compression
Code available | 25 | Compressing Volumetric Radiance Fields to 1 MB | Nov 29, 2022 | Model Compression, NeRF
Code available | 25 | Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design | May 2, 2024 | Model Compression, Neural Network Compression
Code available | 25 | On-Device Domain Generalization | Sep 15, 2022 | Data Augmentation, Domain Generalization
Code available | 25 | PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning | Feb 27, 2024 | Knowledge Distillation, Model Compression
Code available | 25 | MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models | Sep 26, 2024 | Large Language Model, Model Compression
Code available | 25 | Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression
Code available | 25 | Compact 3D Gaussian Representation for Radiance Field | Nov 22, 2023 | 3DGS, Model Compression
Code available | 25 | Well-Read Students Learn Better: On the Importance of Pre-training Compact Models | Aug 23, 2019 | Knowledge Distillation, Language Modelling
Code available | 25 | LightGNN: Simple Graph Neural Network for Recommendation | Jan 6, 2025 | Computational Efficiency, Graph Neural Network
Code available | 25 | MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling
Code available | 25 | Fast convolutional neural networks on FPGAs with hls4ml | Jan 13, 2021 | Model Compression, Quantization
Code available | 25 | Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey | Aug 18, 2023 | Deblurring, Image Restoration
Code available | 25 | Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | Apr 13, 2020 | Knowledge Distillation, Model Compression
Code available | 25 | AMC: AutoML for Model Compression and Acceleration on Mobile Devices | Feb 10, 2018 | AutoML, GPU
Code available | 25 | OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models | Aug 25, 2023 | Common Sense Reasoning, Computational Efficiency
Code available | 25 | Towards Lightweight Super-Resolution with Dual Regression Learning | Jul 16, 2022 | Image Super-Resolution, Model Compression
Code available | 25 | A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness | Jun 16, 2021 | Data Augmentation, Model Compression
Code available | 15 | 3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching | Jul 6, 2022 | Homography Estimation, Model Compression
Code available | 15 | CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution | Jul 4, 2022 | Compiler Optimization, image-classification
Code available | 15 | A Unified Pruning Framework for Vision Transformers | Nov 30, 2021 | Model Compression, object-detection
Code available | 15 | Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis | Sep 22, 2021 | Backdoor Attack, Federated Learning
Code available | 15 | Contrastive Representation Distillation | Oct 23, 2019 | Contrastive Learning, Knowledge Distillation
Code available | 15 | CrossKD: Cross-Head Knowledge Distillation for Object Detection | Jun 20, 2023 | Dense Object Detection, Knowledge Distillation
Code available | 15 | Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference | Jun 26, 2023 | CPU, Model Compression
Code available | 15 | Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting | May 15, 2025 | 3DGS, Model Compression
Code available | 15 | Designing Large Foundation Models for Efficient Training and Inference: A Survey | Sep 3, 2024 | Knowledge Distillation, Model Compression
Code available | 15 | CompRess: Self-Supervised Learning by Compressing Representations | Oct 28, 2020 | Linear evaluation, Model Compression
Code available | 15 | Compression-Aware Video Super-Resolution | Jan 1, 2023 | Model Compression, Super-Resolution
Code available | 15 | Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Dec 17, 2020 | Informativeness, Knowledge Distillation
Code available | 15 | Contrastive Distillation on Intermediate Representations for Language Model Compression | Sep 29, 2020 | Knowledge Distillation, Language Modeling
Code available | 15 | DarwinLM: Evolutionary Structured Pruning of Large Language Models | Feb 11, 2025 | Model Compression
Code available | 15 | Compacting, Picking and Growing for Unforgetting Continual Learning | Oct 15, 2019 | Age And Gender Classification, Continual Learning
Code available | 15 | A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion | Jan 13, 2025 | Dynamic neural networks, Model Compression
Code available | 15 | Basic Binary Convolution Unit for Binarized Image Restoration Network | Oct 2, 2022 | Binarization, Image Restoration
Code available | 15 | Composable Interventions for Language Models | Jul 9, 2024 | knowledge editing, Machine Unlearning
Code available | 15 | An Information Theory-inspired Strategy for Automatic Network Pruning | Aug 19, 2021 | AutoML, Model Compression
Code available | 15