GQ-Net: Training Quantization-Friendly Deep Networks (Sep 25, 2019) [Model Compression, Quantization]
Cosine Similarity Knowledge Distillation for Individual Class Information Transfer (Nov 24, 2023) [Knowledge Distillation, Model Compression]
CORSD: Class-Oriented Relational Self Distillation (Apr 28, 2023) [Knowledge Distillation, Model Compression]
GlueFL: Reconciling Client Sampling and Model Masking for Bandwidth Efficient Federated Learning (Dec 3, 2022) [Federated Learning, Model Compression]
A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation (Jun 11, 2022) [Model Compression, Neural Network Compression]
A Half-Space Stochastic Projected Gradient Method for Group Sparsity Regularization (Jan 1, 2021) [compressed sensing, feature selection]
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference (Feb 2, 2025) [Model Compression, Quantization]
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning (Jan 15, 2024) [Model Compression, Neural Network Compression]
Aggressive Post-Training Compression on Extremely Large Language Models (Sep 30, 2024) [Model Compression, Network Pruning]
Supervised domain adaptation for building extraction from off-nadir aerial images (Nov 7, 2023) [Domain Adaptation, Earth Observation]
Generalized Uncertainty of Deep Neural Networks: Taxonomy and Applications (Feb 2, 2023) [Knowledge Distillation, Model Compression]
General Compression Framework for Efficient Transformer Object Tracking (Sep 26, 2024) [Model Compression, Object]
A Survey on Transformer Compression (Feb 5, 2024) [Knowledge Distillation, Mamba]
How to Explain Neural Networks: an Approximation Perspective (May 17, 2021) [Model Compression]
Continuous Approximations for Improving Quantization Aware Training of LLMs (Oct 6, 2024) [MMLU, Model Compression]
Context-aware deep model compression for edge cloud computing (Nov 29, 2020) [Cloud Computing, Image Classification]
A Survey on Model Compression and Acceleration for Pretrained Language Models (Feb 15, 2022) [Model Compression]
A Survey on Model Compression for Large Language Models (Aug 15, 2023) [Benchmarking, Knowledge Distillation]
Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision (Aug 30, 2021) [Deep Learning, Model Compression]
FTRANS: Energy-Efficient Acceleration of Transformers using FPGA (Jul 16, 2020) [CPU, GPU]
AfroXLMR-Comet: Multilingual Knowledge Distillation with Attention Matching for Low-Resource Languages (Feb 25, 2025) [Knowledge Distillation, Language Modeling]
NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration (Dec 1, 2020) [Bayesian Optimization, Code Generation]
How to Select One Among All? An Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding (Nov 1, 2021) [Adversarial Robustness, All]
ICD-Face: Intra-class Compactness Distillation for Face Recognition (Jan 1, 2023) [Face Recognition, Knowledge Distillation]
FSCNN: A Fast Sparse Convolution Neural Network Inference System (Dec 17, 2022) [Model Compression]
Frustratingly Easy Model Ensemble for Abstractive Summarization (Oct 1, 2018) [Abstractive Text Summarization, Density Estimation]
From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions For Large Language Models (Nov 6, 2024) [Model Compression, Sentence]
Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach (Jun 28, 2022) [Model Compression, Quantization]
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs (Apr 18, 2025) [Knowledge Distillation, Model Compression]
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization (Sep 6, 2021) [channel selection, Model Compression]
GECKO: Reconciling Privacy, Accuracy and Efficiency in Embedded Deep Learning (Oct 2, 2020) [Deep Learning, Model Compression]
GeneCAI: Genetic Evolution for Acquiring Compact AI (Apr 8, 2020) [GPU, Model Compression]
Conditional Teacher-Student Learning (Apr 28, 2019) [Domain Adaptation, Model Compression]
Conditional Generative Data-free Knowledge Distillation (Dec 31, 2021) [Conditional Image Generation, Data-free Knowledge Distillation]
From Cloud to Edge: Rethinking Generative AI for Low-Resource Design Challenges (Feb 20, 2024) [Edge-computing, Model Compression]
A Survey on Green Deep Learning (Nov 8, 2021) [Deep Learning, Knowledge Distillation]
Convolutional Neural Network Compression Based on Low-Rank Decomposition (Aug 29, 2024) [Model Compression, Neural Network Compression]
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks (May 9, 2024) [Knowledge Distillation, Model Compression]
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models (Oct 3, 2024) [All, Language Modeling]
Fragile Mastery: Are Domain-Specific Trade-Offs Undermining On-Device Language Models? (Mar 16, 2025) [Model Compression, Raspberry Pi 4]
Conditional Automated Channel Pruning for Deep Neural Networks (Sep 21, 2020) [Model Compression]
A flexible, extensible software framework for model compression based on the LC algorithm (May 15, 2020) [BIG-bench Machine Learning, Low-rank compression]
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks (Jul 1, 2020) [Computational Efficiency, Model Compression]
HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks (Jan 1, 2022) [Model Compression, Vocal Bursts Intensity Prediction]
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval (May 28, 2023) [Image Retrieval, Knowledge Distillation]
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference (Dec 23, 2024) [GPU, Language Modeling]
Gradient-Free Structured Pruning with Unlabeled Data (Mar 7, 2023) [GPU, Model Compression]
Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures (Jan 16, 2025) [Model Compression, Quantization]
Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks (Nov 21, 2018) [Knowledge Distillation, Model Compression]
Formalizing Generalization and Robustness of Neural Networks to Weight Perturbations (Mar 3, 2021) [Model Compression]