Compressed models are NOT miniature versions of large models | Jul 18, 2024 | Adversarial Attack, Model Compression | Unverified | 0
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models | Jul 17, 2024 | Mamba, Model Compression | Unverified | 0
Minimizing PLM-Based Few-Shot Intent Detectors | Jul 13, 2024 | Data Augmentation, Knowledge Distillation | Code Available | 0
Inference Optimization of Foundation Models on AI Accelerators | Jul 12, 2024 | Inference Optimization, Model Compression | Unverified | 0
Explicit-NeRF-QA: A Quality Assessment Database for Explicit NeRF Model Compression | Jul 11, 2024 | Model Compression, NeRF | Code Available | 0
Composable Interventions for Language Models | Jul 9, 2024 | knowledge editing, Machine Unlearning | Code Available | 1
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression | Jul 6, 2024 | Language Modeling, Language Modelling | Code Available | 0
Quantizing YOLOv7: A Comprehensive Study | Jul 6, 2024 | Model Compression, object-detection | Unverified | 0
AMD: Automatic Multi-step Distillation of Large-scale Vision Models | Jul 5, 2024 | image-classification, Image Classification | Unverified | 0
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models | Jul 5, 2024 | Deep Reinforcement Learning, Model Compression | Unverified | 0
MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models | Jul 3, 2024 | Extractive Question-Answering, Knowledge Distillation | Unverified | 0
Efficient DNN-Powered Software with Fair Sparse Models | Jul 3, 2024 | Fairness, Model Compression | Unverified | 0
FoldGPT: Simple and Effective Large Language Model Compression Scheme | Jul 1, 2024 | Language Modeling, Language Modelling | Unverified | 0
MCNC: Manifold Constrained Network Compression | Jun 27, 2024 | Model Compression, Quantization | Unverified | 0
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression | Code Available | 2
LiteYOLO-ID: A Lightweight Object Detection Network for Insulator Defect Detection | Jun 24, 2024 | Defect Detection, Insulator Defect Detection | Code Available | 1
Exploring compressibility of transformer based text-to-music (TTM) models | Jun 24, 2024 | Decoder, FAD | Unverified | 0
Speeding Up Image Classifiers with Little Companions | Jun 24, 2024 | image-classification, Image Classification | Unverified | 0
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging | Jun 24, 2024 | MMLU, Model Compression | Code Available | 1
Reinforced Knowledge Distillation for Time Series Regression | Jun 21, 2024 | Knowledge Distillation, Model Compression | Code Available | 0
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code Available | 2
FLoCoRA: Federated learning compression with low-rank adaptation | Jun 20, 2024 | Federated Learning, Model Compression | Code Available | 0
Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices | Jun 20, 2024 | Knowledge Distillation, Model Compression | Unverified | 0
SDQ: Sparse Decomposed Quantization for LLM Inference | Jun 19, 2024 | Model Compression, Quantization | Unverified | 0
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model | Jun 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead | Jun 17, 2024 | GPU, Model Compression | Unverified | 0
An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers | Jun 17, 2024 | Model Compression, text-classification | Unverified | 0
Model Adaptation for Time Constrained Embodied Control | Jun 17, 2024 | Autonomous Driving, Decision Making | Unverified | 0
Knowledge Distillation in Federated Learning: a Survey on Long Lasting Challenges and New Solutions | Jun 16, 2024 | Federated Learning, Knowledge Distillation | Unverified | 0
Implicit Neural Representation for Videos Based on Residual Connection | Jun 15, 2024 | Image Reconstruction, Model Compression | Unverified | 0
EncCluster: Scalable Functional Encryption in Federated Learning through Weight Clustering and Probabilistic Filters | Jun 13, 2024 | Federated Learning, Model Compression | Unverified | 0
PC-LoRA: Low-Rank Adaptation for Progressive Model Compression with Knowledge Distillation | Jun 13, 2024 | Knowledge Distillation, Model Compression | Unverified | 0
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases | Jun 12, 2024 | Benchmarking, Model Compression | Unverified | 0
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | Jun 12, 2024 | document-image-classification, Document Image Classification | Unverified | 0
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Jun 12, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1
On the social bias of speech self-supervised models | Jun 7, 2024 | Model Compression, Self-Supervised Learning | Unverified | 0
Slicing Mutual Information Generalization Bounds for Neural Networks | Jun 6, 2024 | Generalization Bounds, Model Compression | Code Available | 0
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective | Jun 6, 2024 | Generalization Bounds, In-Context Learning | Code Available | 0
Reweighted Solutions for Weighted Low Rank Approximation | Jun 4, 2024 | feature selection, Model Compression | Unverified | 0
Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning | Jun 3, 2024 | Model Compression, Network Pruning | Unverified | 0
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model | Jun 1, 2024 | Knowledge Distillation, Model Compression | Code Available | 0
LCQ: Low-Rank Codebook based Quantization for Large Language Models | May 31, 2024 | Model Compression, Quantization | Unverified | 0
Effective Interplay between Sparsity and Quantization: From Theory to Practice | May 31, 2024 | Computational Efficiency, Model Compression | Unverified | 0
Occam Gradient Descent | May 30, 2024 | image-classification, Image Classification | Code Available | 0
Dual sparse training framework: inducing activation map sparsity via Transformed ℓ1 regularization | May 30, 2024 | Model Compression | Unverified | 0
subMFL: Compatiple subModel Generation for Federated Learning in Device Heterogenous Environment | May 30, 2024 | Federated Learning, Model Compression | Code Available | 0
ExtremeMETA: High-speed Lightweight Image Segmentation Model by Remodeling Multi-channel Metamaterial Imagers | May 27, 2024 | Image Segmentation, Model Compression | Unverified | 0
Efficient Model Compression for Hierarchical Federated Learning | May 27, 2024 | Edge-computing, Federated Learning | Unverified | 0
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models | May 27, 2024 | Information Retrieval, Language Modelling | Unverified | 0
TinyM^2Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment | May 20, 2024 | Knowledge Distillation, Model Compression | Unverified | 0