Divergent Token Metrics: Measuring degradation to prune away LLM components -- and optimize quantization | Nov 2, 2023 | Management, Model Compression | Unverified 0
BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables | Sep 7, 2021 | Model Compression | Unverified 0
An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation | Feb 28, 2020 | Knowledge Distillation, Memorization | Unverified 0
AdaKD: Dynamic Knowledge Distillation of ASR models using Adaptive Loss Weighting | May 11, 2024 | Knowledge Distillation, Model Compression | Unverified 0
An Effective Information Theoretic Framework for Channel Pruning | Aug 14, 2024 | Model Compression | Unverified 0
Distilling with Performance Enhanced Students | Oct 24, 2018 | Model Compression | Unverified 0
Distributed Low Precision Training Without Mixed Precision | Nov 18, 2019 | GPU, Model Compression | Unverified 0
DKM: Differentiable K-Means Clustering Layer for Neural Network Compression | Aug 28, 2021 | Clustering, Model Compression | Unverified 0
DMT: Comprehensive Distillation with Multiple Self-supervised Teachers | Dec 19, 2023 | Contrastive Learning, Model Compression | Unverified 0
Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures | Apr 25, 2023 | Model Compression, Network Pruning | Unverified 0
An Automatic and Efficient BERT Pruning for Edge AI Systems | Jun 21, 2022 | CPU, Model Compression | Unverified 0
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models | Feb 27, 2025 | Knowledge Distillation, Model Compression | Unverified 0
Analysis of Quantization on MLP-based Vision Models | Sep 14, 2022 | Model Compression, Quantization | Unverified 0
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles | Jun 8, 2020 | Model Compression | Unverified 0
Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments | Aug 6, 2024 | image-classification, Image Classification | Unverified 0
Beware of Calibration Data for Pruning Large Language Models | Oct 23, 2024 | Model Compression | Unverified 0
Analysis of memory consumption by neural networks based on hyperparameters | Oct 21, 2021 | Deep Learning, Model Compression | Unverified 0
Benchmarking Adversarial Robustness of Compressed Deep Learning Models | Aug 16, 2023 | Adversarial Robustness, Benchmarking | Unverified 0
An Algorithm-Hardware Co-Optimized Framework for Accelerating N:M Sparse Transformers | Aug 12, 2022 | Computational Efficiency, Model Compression | Unverified 0
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation | Mar 8, 2025 | Autonomous Driving, feature selection | Unverified 0
BD-KD: Balancing the Divergences for Online Knowledge Distillation | Dec 25, 2022 | Knowledge Distillation, Model Compression | Unverified 0
An Efficient Real-Time Object Detection Framework on Resource-Constricted Hardware Devices via Software and Hardware Co-design | Aug 2, 2024 | Model Compression, Neural Network Compression | Unverified 0
Activation Sparsity Opportunities for Compressing General Large Language Models | Dec 13, 2024 | Model Compression | Unverified 0
Bayesian Federated Model Compression for Communication and Computation Efficiency | Apr 11, 2024 | Bayesian Inference, Federated Learning | Unverified 0
Bayesian Deep Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing | Feb 12, 2024 | Bayesian Inference, Federated Learning | Unverified 0
A Model Compression Method with Matrix Product Operators for Speech Enhancement | Oct 10, 2020 | Model Compression, Speech Enhancement | Unverified 0
A Mixed Integer Programming Approach for Verifying Properties of Binarized Neural Networks | Mar 11, 2022 | Collision Avoidance, Model Compression | Unverified 0
Balancing Specialization, Generalization, and Compression for Detection and Tracking | Sep 25, 2019 | Model Compression | Unverified 0
Balancing Cost and Benefit with Tied-Multi Transformers | Feb 20, 2020 | Decoder, Knowledge Distillation | Unverified 0
Activation Map Adaptation for Effective Knowledge Distillation | Oct 26, 2020 | Knowledge Distillation, Model Compression | Unverified 0
Single-path Bit Sharing for Automatic Loss-aware Model Compression | Jan 13, 2021 | Model Compression, Network Pruning | Unverified 0
Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression | Sep 30, 2023 | Inductive Bias, Knowledge Distillation | Unverified 0
Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation | Apr 20, 2020 | 3D Shape Classification, 3D Shape Retrieval | Unverified 0
A Memory-Efficient Learning Framework for SymbolLevel Precoding with Quantized NN Weights | Oct 13, 2021 | Model Compression, Quantization | Unverified 0
AMD: Automatic Multi-step Distillation of Large-scale Vision Models | Jul 5, 2024 | image-classification, Image Classification | Unverified 0
Deep Model Compression Via Two-Stage Deep Reinforcement Learning | Dec 4, 2019 | Autonomous Driving, Deep Reinforcement Learning | Unverified 0
Deep Model Compression: Distilling Knowledge from Noisy Teachers | Oct 30, 2016 | Model Compression | Unverified 0
Deep Model Compression based on the Training History | Jan 30, 2021 | model, Model Compression | Unverified 0
A Web-Based Solution for Federated Learning with LLM-Based Automation | Aug 23, 2024 | CPU, Federated Learning | Unverified 0
AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates | Jul 6, 2019 | Deep Reinforcement Learning, Heuristic Search | Unverified 0
Deep learning model compression using network sensitivity and gradients | Oct 11, 2022 | Deep Learning, Model Compression | Unverified 0
AMD: Adaptive Masked Distillation for Object Detection | Jan 31, 2023 | Knowledge Distillation, Model Compression | Unverified 0
DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization | May 21, 2018 | Action Recognition, General Classification | Unverified 0
Activation Density based Mixed-Precision Quantization for Energy Efficient Neural Networks | Jan 12, 2021 | Model Compression, Quantization | Unverified 0
Discrete Model Compression With Resource Constraint for Deep Neural Networks | Jun 1, 2020 | Model Compression | Unverified 0
Neural Epitome Search for Architecture-Agnostic Network Compression | Jul 12, 2019 | channel selection, Model Compression | Unverified 0
AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent | Jun 11, 2025 | Model Compression, Quantization | Unverified 0
DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices | Aug 16, 2017 | CPU, Model Compression | Unverified 0
Automatic Mixed-Precision Quantization Search of BERT | Dec 30, 2021 | Knowledge Distillation, Model Compression | Unverified 0
Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes | Jan 18, 2021 | Clustering, Fault Detection | Unverified 0