Knowledge Distillation with Refined Logits | Aug 14, 2024 | Knowledge Distillation, Model Compression
[Code] Composable Interventions for Language Models | Jul 9, 2024 | Knowledge Editing, Machine Unlearning
[Code] Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging | Jun 24, 2024 | MMLU, Model Compression
[Code] LiteYOLO-ID: A Lightweight Object Detection Network for Insulator Defect Detection | Jun 24, 2024 | Defect Detection, Insulator Defect Detection
[Code] Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Jun 12, 2024 | Benchmarking, Mixture-of-Experts
[Code] Transferable and Principled Efficiency for Open-Vocabulary Segmentation | Apr 11, 2024 | Model Compression
[Code] Streamlining Redundant Layers to Compress Large Language Models | Mar 28, 2024 | Model Compression
[Code] PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation | Mar 14, 2024 | Model Compression, Parameter-Efficient Fine-Tuning
[Code] Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Mar 10, 2024 | Image Retrieval, Knowledge Distillation
[Code] "Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach | Mar 1, 2024 | Model Compression, Quantization
[Code] PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning | Feb 20, 2024 | Instruction Following, Knowledge Distillation
[Code] Fast Vocabulary Transfer for Language Model Compression | Feb 15, 2024 | Language Modeling
[Code] Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward | Feb 2, 2024 | Model Compression, Survey
[Code] Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Jan 25, 2024 | Clustering, Federated Learning
[Code] Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Jan 17, 2024 | Dynamic Neural Networks, GPU
[Code] Retraining-free Model Quantization via One-Shot Weight-Coupling Learning | Jan 3, 2024 | Model Compression, Quantization
[Code] Generative Model-based Feature Knowledge Distillation for Action Recognition | Dec 14, 2023 | Action Detection, Action Recognition
[Code] Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models | Dec 12, 2023 | GPU, Model Compression
[Code] LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning | Nov 20, 2023 | GPU, Language Modeling
[Code] An Empirical Study of CLIP for Text-based Person Search | Aug 19, 2023 | Cross-Modal Retrieval, Data Augmentation
[Code] Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models | Aug 7, 2023 | Language Modeling
[Code] Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision | Jul 1, 2023 | Knowledge Distillation, Model Compression
[Code] Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference | Jun 26, 2023 | CPU, Model Compression
[Code] CrossKD: Cross-Head Knowledge Distillation for Object Detection | Jun 20, 2023 | Dense Object Detection, Knowledge Distillation
[Code] HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation | Jun 16, 2023 | Model Compression, Quantization
[Code] Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection | Jun 12, 2023 | Model Compression, Quantization
[Code] MobileNMT: Enabling Translation in 15MB and 30ms | Jun 7, 2023 | Model Compression, NMT
[Code] LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning | May 28, 2023 | Model Compression, Network Pruning
[Code] COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models | May 26, 2023 | Model Compression
[Code] An Efficient Multilingual Language Model Compression through Vocabulary Trimming | May 24, 2023 | Language Modeling
[Code] AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | May 17, 2023 | Knowledge Distillation, Language Modeling
[Code] Class Attention Transfer Based Knowledge Distillation | Apr 25, 2023 | Knowledge Distillation, Model Compression
[Code] Performance-aware Approximation of Global Channel Pruning for Multitask CNNs | Mar 21, 2023 | Model Compression
[Code] The Tiny Time-series Transformer: Low-latency High-throughput Classification of Astronomical Transients using Deep Model Compression | Mar 15, 2023 | Astronomy, Model Compression
[Code] Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding | Feb 27, 2023 | Model Compression, Representation Learning
[Code] Dual Relation Knowledge Distillation for Object Detection | Feb 11, 2023 | Knowledge Distillation, Model Compression
[Code] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers | Jan 31, 2023 | Image Captioning, Image Classification
[Code] Compression-Aware Video Super-Resolution | Jan 1, 2023 | Model Compression, Super-Resolution
[Code] FFNeRV: Flow-Guided Frame-Wise Neural Representations for Videos | Dec 23, 2022 | Model Compression, Quantization
[Code] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers | Dec 16, 2022 | Model Compression, Quantization
[Code] FedUKD: Federated UNet Model with Knowledge Distillation for Land Use Classification from Satellite and Street Views | Dec 5, 2022 | Knowledge Distillation, Model Compression
[Code] Discovering Dynamic Patterns from Spatiotemporal Data with Time-Varying Low-Rank Autoregression | Nov 28, 2022 | Model Compression
[Code] Unbiased Knowledge Distillation for Recommendation | Nov 27, 2022 | Knowledge Distillation, Model Compression
[Code] Sparse Probabilistic Circuits via Pruning and Growing | Nov 22, 2022 | Model Compression
[Code] Parameter-Efficient Masking Networks | Oct 13, 2022 | Model Compression
[Code] Less is More: Task-aware Layer-wise Distillation for Language Model Compression | Oct 4, 2022 | Language Modeling
[Code] Basic Binary Convolution Unit for Binarized Image Restoration Network | Oct 2, 2022 | Binarization, Image Restoration
[Code] Efficient On-Device Session-Based Recommendation | Sep 27, 2022 | Knowledge Distillation, Model Compression
[Code] PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers | Sep 13, 2022 | Data-Free Quantization, Image Classification
[Code] DUET: A Tuning-Free Device-Cloud Collaborative Parameters Generation Framework for Efficient Device Model Generalization | Sep 12, 2022 | Device-Cloud Collaboration, Domain Adaptation