LLM Inference Unveiled: Survey and Roofline Model Insights Feb 26, 2024 Knowledge Distillation Language Modelling
Code Code Available 4Model Compression Method for S4 with Diagonal State Space Layers using Balanced Truncation Feb 25, 2024 Model Compression
— Unverified 0FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing Feb 21, 2024 GPU Model Compression
— Unverified 0PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning Feb 20, 2024 Instruction Following Knowledge Distillation
Code Code Available 1From Cloud to Edge: Rethinking Generative AI for Low-Resource Design Challenges Feb 20, 2024 Edge-computing Model Compression
— Unverified 0A Survey on Knowledge Distillation of Large Language Models Feb 20, 2024 Data Augmentation Knowledge Distillation
Code Code Available 5Towards a tailored mixed-precision sub-8-bit quantization scheme for Gated Recurrent Units using Genetic Algorithms Feb 19, 2024 Model Compression Quantization
— Unverified 0Extraction of nonlinearity in neural networks with Koopman operator Feb 18, 2024 Model Compression
— Unverified 0Model Compression and Efficient Inference for Large Language Models: A Survey Feb 15, 2024 Knowledge Distillation Model Compression
— Unverified 0Fast Vocabulary Transfer for Language Model Compression Feb 15, 2024 Language Modeling Language Modelling
Code Code Available 1Bayesian Deep Learning Via Expectation Maximization and Turbo Deep Approximate Message Passing Feb 12, 2024 Bayesian Inference Federated Learning
— Unverified 0Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy Feb 8, 2024 Model Compression
— Unverified 0L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models Feb 7, 2024 Few-Shot Learning In-Context Learning
— Unverified 0The Potential of AutoML for Recommender Systems Feb 6, 2024 AutoML Machine Translation
— Unverified 0Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes Feb 6, 2024 Federated Learning Model Compression
— Unverified 0Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression Feb 6, 2024 Federated Learning Model Compression
— Unverified 0QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning Feb 6, 2024 Image Generation Model Compression
Code Code Available 2A Survey on Transformer Compression Feb 5, 2024 Knowledge Distillation Mamba
— Unverified 0Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation Feb 5, 2024 Model Compression Recommendation Systems
— Unverified 0Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward Feb 2, 2024 Model Compression Survey
Code Code Available 1Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models Feb 2, 2024 Image Generation Model Compression
— Unverified 0Effective Multi-Stage Training Model For Edge Computing Devices In Intrusion Detection Jan 31, 2024 Edge-computing Intrusion Detection
— Unverified 0EPSD: Early Pruning with Self-Distillation for Efficient Model Compression Jan 31, 2024 Knowledge Distillation Model Compression
— Unverified 0RADIN: Souping on a Budget Jan 31, 2024 Ensemble Learning Model Compression
— Unverified 0Diffusion Model Compression for Image-to-Image Translation Jan 31, 2024 Conditional Image Generation Denoising
— Unverified 0SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget Jan 30, 2024 GPU Model Compression
— Unverified 0LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection Jan 29, 2024 3D Object Detection Autonomous Vehicles
Code Code Available 2TQCompressor: improving tensor decomposition methods in neural networks via permutations Jan 29, 2024 Knowledge Distillation Model Compression
Code Code Available 0CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks Jan 25, 2024 Model Compression Quantization
— Unverified 0Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation Jan 25, 2024 Clustering Federated Learning
Code Code Available 1Large receptive field strategy and important feature extraction strategy in 3D object detection Jan 22, 2024 3D Object Detection Autonomous Driving
— Unverified 0Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning Jan 19, 2024 Model Compression
Code Code Available 0ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks Jan 18, 2024 Low-rank compression Model Compression
— Unverified 0SymbolNet: Neural Symbolic Regression with Adaptive Dynamic Pruning for Compression Jan 18, 2024 Jet Tagging Model Compression
Code Code Available 0Model Compression Techniques in Biometrics Applications: A Survey Jan 18, 2024 Fairness Knowledge Distillation
Code Code Available 0Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices Jan 17, 2024 Dynamic neural networks GPU
Code Code Available 1Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning Jan 15, 2024 Model Compression Neural Network Compression
— Unverified 0Knowledge Translation: A New Pathway for Model Compression Jan 11, 2024 Data Augmentation model
Code Code Available 0FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference Jan 8, 2024 GPU Language Modeling
— Unverified 0Understanding LLMs: A Comprehensive Overview from Training to Inference Jan 4, 2024 Language Modeling Language Modelling
— Unverified 0Retraining-free Model Quantization via One-Shot Weight-Coupling Learning Jan 3, 2024 Model Compression Quantization
Code Code Available 1Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment Jan 2, 2024 Inference Attack Membership Inference Attack
Code Code Available 0Data-Free Quantization via Pseudo-label Filtering Jan 1, 2024 Data Free Quantization Model Compression
— Unverified 0Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection Jan 1, 2024 feature selection Model Compression
— Unverified 0Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation Dec 30, 2023 Adversarial Attack Classification
— Unverified 0DMT: Comprehensive Distillation with Multiple Self-supervised Teachers Dec 19, 2023 Contrastive Learning Model Compression
— Unverified 0Integrating Fairness and Model Pruning Through Bi-level Optimization Dec 15, 2023 Fairness Model Compression
— Unverified 0Generative Model-based Feature Knowledge Distillation for Action Recognition Dec 14, 2023 Action Detection Action Recognition
Code Code Available 1RankDVQA-mini: Knowledge Distillation-Driven Deep Video Quality Assessment Dec 14, 2023 Knowledge Distillation Model Compression
— Unverified 0Unraveling Key Factors of Knowledge Distillation Dec 14, 2023 Knowledge Distillation Machine Translation
— Unverified 0