Greener yet Powerful: Taming Large Code Generation Models with Quantization (Mar 9, 2023). Tags: Code Generation, Code Summarization
Group channel pruning and spatial attention distilling for object detection (Jun 2, 2023). Tags: Knowledge Distillation, Model Compression
GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking (Jun 18, 2018). Tags: Language Modelling
Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks (Jun 14, 2022). Tags: Atrial Fibrillation Detection, Model Compression
Conditional Automated Channel Pruning for Deep Neural Networks (Sep 21, 2020). Tags: Model Compression
HadaNets: Flexible Quantization Strategies for Neural Networks (May 26, 2019). Tags: Model Compression, Quantization
HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks (Jan 20, 2023). Tags: GPU, Low-rank Compression
A flexible, extensible software framework for model compression based on the LC algorithm (May 15, 2020). Tags: BIG-bench Machine Learning, Low-rank Compression
HCE: Improving Performance and Efficiency with Heterogeneously Compressed Neural Network Ensemble (Jan 18, 2023). Tags: Diversity, Ensemble Learning
Investigation of Practical Aspects of Single Channel Speech Separation for ASR (Jul 5, 2021). Tags: Automatic Speech Recognition (ASR)
HFSP: A Hardware-friendly Soft Pruning Framework for Vision Transformers (Sep 29, 2021). Tags: Image Classification
HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask (Jun 9, 2022). Tags: Federated Learning, Model Compression
Cross-Channel Intragroup Sparsity Neural Network (Oct 26, 2019). Tags: Model Compression, Network Pruning
Attention Sinks and Outlier Features: A 'Catch, Tag, and Release' Mechanism for Embeddings (Feb 2, 2025). Tags: Model Compression, TAG
ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval (May 28, 2023). Tags: Image Retrieval, Knowledge Distillation
HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks (Jan 1, 2022). Tags: Model Compression, Vocal Bursts Intensity Prediction
Formalizing Generalization and Robustness of Neural Networks to Weight Perturbations (Mar 3, 2021). Tags: Model Compression
How and When Adversarial Robustness Transfers in Knowledge Distillation? (Oct 22, 2021). Tags: Adversarial Robustness, Knowledge Distillation
Aerial Image Classification in Scarce and Unconstrained Environments via Conformal Prediction (Apr 24, 2025). Tags: Conformal Prediction, Image Classification
Deep Face Recognition Model Compression via Knowledge Transfer and Distillation (Jun 3, 2019). Tags: Face Recognition, Knowledge Distillation
How to Explain Neural Networks: an Approximation Perspective (May 17, 2021). Tags: Model Compression
How to Select One Among All? An Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding (Nov 1, 2021). Tags: Adversarial Robustness, All
Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations (Dec 1, 2021). Tags: Adversarial Robustness, Model Compression
Redundancy and Concept Analysis for Code-trained Language Models (May 1, 2023). Tags: Memorization, Model Compression
CURing Large Models: Compression via CUR Decomposition (Jan 8, 2025). Tags: Model Compression
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference (Feb 2, 2025). Tags: Model Compression, Quantization
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models (Jan 24, 2025). Tags: Model Compression, Quantization
D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving (Apr 17, 2025). Tags: Mixture-of-Experts, Model Compression
FoldGPT: Simple and Effective Large Language Model Compression Scheme (Jul 1, 2024). Tags: Language Modelling
DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks (Nov 19, 2019). Tags: Model Compression, Network Pruning
ICD-Face: Intra-class Compactness Distillation for Face Recognition (Jan 1, 2023). Tags: Face Recognition, Knowledge Distillation
Identifying Sub-networks in Neural Networks via Functionally Similar Representations (Oct 21, 2024). Tags: Model Compression
ILMPQ: An Intra-Layer Multi-Precision Deep Neural Network Quantization framework for FPGA (Oct 30, 2021). Tags: Edge-computing, Model Compression
DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer (Jul 5, 2017). Tags: Clustering, Image Clustering
FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks (Nov 7, 2018). Tags: GPU, Image Classification
Impact of Disentanglement on Pruning Neural Networks (Jul 19, 2023). Tags: Disentanglement, Model Compression
Implicit Neural Representation for Videos Based on Residual Connection (Jun 15, 2024). Tags: Image Reconstruction, Model Compression
A Survey on Drowsiness Detection -- Modern Applications and Methods (Aug 23, 2024). Tags: Model Compression, Survey
Computation-efficient Deep Learning for Computer Vision: A Survey (Aug 27, 2023). Tags: Autonomous Vehicles, Deep Learning
A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation (Mar 12, 2017). Tags: Classification, Domain Adaptation
Improve Knowledge Distillation via Label Revision and Data Selection (Apr 3, 2024). Tags: Knowledge Distillation, Model Compression
Interpreting Deep Classifier by Visual Distillation of Dark Knowledge (Mar 11, 2018). Tags: Dimensionality Reduction, Model Compression
Intrinsically Sparse Long Short-Term Memory Networks (Jan 26, 2019). Tags: Model Compression, Sentiment Analysis
Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning (Aug 26, 2023). Tags: Knowledge Distillation, Model Compression
Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing (May 22, 2025). Tags: Model Compression, Neural Network Compression
FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep Neural Networks (Dec 29, 2022). Tags: Model Compression, Quantization
FIT: A Metric for Model Sensitivity (Oct 16, 2022). Tags: Model Compression
In defense of parameter sharing for model-compression (Oct 17, 2023). Tags: Model Compression
Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models (Nov 27, 2024). Tags: Model Compression, Video Generation
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead (Jun 17, 2024). Tags: GPU, Model Compression