- Activation Sparsity Opportunities for Compressing General Large Language Models (Dec 13, 2024) [Model Compression]
- Can Students Beyond The Teacher? Distilling Knowledge from Teacher's Bias (Dec 13, 2024) [Knowledge Distillation, Model Compression]
- Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices (Dec 12, 2024) [Knowledge Distillation, Mamba]
- Low-Rank Correction for Quantized LLMs (Dec 10, 2024) [Model Compression, Quantization]
- VQ4ALL: Efficient Neural Network Representation via a Universal Codebook (Dec 9, 2024) [Density Estimation, Efficient Neural Network]
- Compression for Better: A General and Stable Lossless Compression Framework (Dec 9, 2024) [Computational Efficiency, Model Compression]
- Lossless Model Compression via Joint Low-Rank Factorization Optimization (Dec 9, 2024) [Model Compression, Model Optimization]
- Trimming Down Large Spiking Vision Transformers via Heterogeneous Quantization Search (Dec 7, 2024) [Model Compression, Quantization]
- CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models (Dec 3, 2024) [Language Modeling]
- Efficient Model Compression Techniques with FishLeg (Dec 3, 2024) [Meta-Learning, model]
- Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models (Nov 27, 2024) [Model Compression, Video Generation]
- Faithful Label-free Knowledge Distillation (Nov 22, 2024) [Inductive Bias, Knowledge Distillation]
- Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion (Nov 22, 2024) [Image Generation, Model Compression] (code available)
- TaQ-DiT: Time-aware Quantization for Diffusion Transformers (Nov 21, 2024) [Denoising, Model Compression]
- FASTNav: Fine-tuned Adaptive Small-language-models Trained for Multi-point Robot Navigation (Nov 20, 2024) [Edge-computing, Model Compression]
- What Makes a Good Dataset for Knowledge Distillation? (Nov 19, 2024) [Continual Learning, Knowledge Distillation]
- Puppet-CNN: Input-Adaptive Convolutional Neural Networks with Model Compression using Ordinary Differential Equation (Nov 19, 2024) [Model Compression]
- Bridging the Resource Gap: Deploying Advanced Imitation Learning Models onto Affordable Embedded Platforms (Nov 18, 2024) [Imitation Learning, Model Compression]
- An exploration of the effect of quantisation on energy consumption and inference time of StarCoder2 (Nov 15, 2024) [Model Compression, Quantization]
- Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition (Nov 14, 2024) [Emotion Recognition, Model Compression] (code available)
- Feature Interaction Fusion Self-Distillation Network For CTR Prediction (Nov 12, 2024) [Click-Through Rate Prediction, Knowledge Distillation]
- OWLed: Outlier-weighed Layerwise Pruning for Efficient Autonomous Driving Framework (Nov 12, 2024) [Autonomous Driving, Decision Making]
- ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization (Nov 12, 2024) [Language Modeling] (code available)
- Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning (Nov 12, 2024) [Deep Reinforcement Learning, Model Compression]
- ZipNN: Lossless Compression for AI Models (Nov 7, 2024) [Model Compression]
- From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions For Large Language Models (Nov 6, 2024) [Model Compression, Sentence] (code available)
- Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy (Nov 5, 2024) [Model Compression]
- Efficient Model Compression for Bayesian Neural Networks (Nov 1, 2024) [Deep Learning, feature selection] (code available)
- ML Research Benchmark (Oct 29, 2024) [Model Compression, Navigate]
- LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment (Oct 28, 2024) [Benchmarking, Language Modeling] (code available)
- EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation (Oct 28, 2024) [ARC, Math] (code available)
- A Survey of Small Language Models (Oct 25, 2024) [Benchmarking, Model Compression]
- SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models (Oct 25, 2024) [Instruction Following, Knowledge Distillation]
- Beware of Calibration Data for Pruning Large Language Models (Oct 23, 2024) [Model Compression]
- Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation (Oct 23, 2024) [Data-free Knowledge Distillation, Diversity]
- Self-calibration for Language Model Quantization and Pruning (Oct 22, 2024) [Language Modeling] (code available)
- Identifying Sub-networks in Neural Networks via Functionally Similar Representations (Oct 21, 2024) [Model Compression]
- EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search (Oct 18, 2024) [Model Compression, Quantization]
- Preview-based Category Contrastive Learning for Knowledge Distillation (Oct 18, 2024) [Contrastive Learning, Knowledge Distillation] (code available)
- QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models (Oct 14, 2024) [Model Compression, Tensor Decomposition]
- SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression (Oct 12, 2024) [Model Compression, Natural Language Understanding] (code available)
- What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias (Oct 10, 2024) [Age/Unbiased, Fairness] (code available)
- CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression (Oct 10, 2024) [Language Modeling]
- Large Language Model Compression with Neural Architecture Search (Oct 9, 2024) [Instruction Following, Language Modeling]
- QT-DoG: Quantization-aware Training for Domain Generalization (Oct 8, 2024) [Domain Generalization, Model Compression]
- SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching (Oct 8, 2024) [Model Compression, Natural Language Understanding] (code available)
- ESPACE: Dimensionality Reduction of Activations for Model Compression (Oct 7, 2024) [Dimensionality Reduction, model]
- Continuous Approximations for Improving Quantization Aware Training of LLMs (Oct 6, 2024) [MMLU, Model Compression]
- Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models (Oct 3, 2024) [All, Language Modeling]
- Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression (Oct 2, 2024) [Language Modeling]