| Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference | Jun 4, 2023 | Decoder, Knowledge Distillation | Unverified | 0 |
| Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression | May 25, 2023 | Multi-Task Learning, Neural Network Compression | Code Available | 0 |
| Evaluation Metrics for DNNs Compression | May 18, 2023 | Neural Network Compression, Object | Unverified | 0 |
| How Informative is the Approximation Error from Tensor Decomposition for Neural Network Compression? | May 9, 2023 | Neural Network Compression, Tensor Decomposition | Unverified | 0 |
| Guaranteed Quantization Error Computation for Neural Network Model Compression | Apr 26, 2023 | Model Compression, Neural Network Compression | Unverified | 0 |
| SwiftTron: An Efficient Hardware Accelerator for Quantized Transformers | Apr 8, 2023 | Neural Network Compression, Quantization | Code Available | 1 |
| WHC: Weighted Hybrid Criterion for Filter Pruning on Convolutional Neural Networks | Feb 16, 2023 | Classification, Network Pruning | Code Available | 0 |
| DepGraph: Towards Any Structural Pruning | Jan 30, 2023 | Network Pruning, Neural Network Compression | Code Available | 4 |
| Magnitude and Similarity based Variable Rate Filter Pruning for Efficient Convolution Neural Networks | Dec 27, 2022 | Network Pruning, Neural Network Compression | Code Available | 0 |
| PD-Quant: Post-Training Quantization based on Prediction Difference Metric | Dec 14, 2022 | Neural Network Compression, Quantization | Code Available | 1 |