Aerial Image Classification in Scarce and Unconstrained Environments via Conformal Prediction (Apr 24, 2025). Tags: Conformal Prediction, Image Classification.
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs (Apr 18, 2025). Tags: Knowledge Distillation, Model Compression. [Unverified]
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs (Apr 17, 2025). Tags: Model Compression, Quantization. [Unverified]
D^2MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving (Apr 17, 2025). Tags: Mixture-of-Experts, Model Compression. [Code available]
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning (Apr 15, 2025). Tags: Knowledge Distillation, Language Modeling. [Unverified]
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design (Apr 10, 2025). Tags: Model Compression, Quantization. [Unverified]
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models (Apr 7, 2025). Tags: Adversarial Robustness, Diversity. [Code available]
Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression (Apr 6, 2025). Tags: Computational Efficiency, Language Modeling. [Unverified]
Compression Laws for Large Language Models (Apr 6, 2025). Tags: Model Compression. [Code available]
RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation (Apr 4, 2025). Tags: Change Detection, Depth Estimation. [Unverified]
Compositionality Unlocks Deep Interpretable Models (Apr 3, 2025). Tags: Model Compression, Tensor Networks. [Unverified]
Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression (Apr 2, 2025). Tags: Denoising, Knowledge Distillation. [Unverified]
Penrose Tiled Low-Rank Compression and Section-Wise Q&A Fine-Tuning: A General Framework for Domain-Specific Large Language Model Adaptation (Mar 28, 2025). Tags: Language Modeling. [Unverified]
Multi-Task Semantic Communications via Large Models (Mar 28, 2025). Tags: Model Compression, Retrieval-Augmented Generation. [Unverified]
Delving Deep into Semantic Relation Distillation (Mar 27, 2025). Tags: Knowledge Distillation, Model Compression. [Unverified]
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness (Mar 27, 2025). Tags: Language Modeling. [Unverified]
Boosting Large Language Models with Mask Fine-Tuning (Mar 27, 2025). Tags: Language Modeling. [Unverified]
Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration (Mar 27, 2025). Tags: Computational Efficiency, Image Restoration. [Code available]
A Low-Power Streaming Speech Enhancement Accelerator for Edge Devices (Mar 27, 2025). Tags: Model Compression, Speech Enhancement. [Unverified]
Temporal Action Detection Model Compression by Progressive Block Drop (Mar 21, 2025). Tags: Action Detection, Autonomous Driving. [Unverified]
Large Language Model Compression via the Nested Activation-Aware Decomposition (Mar 21, 2025). Tags: Language Modeling. [Unverified]
InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer (Mar 20, 2025). Tags: Knowledge Distillation, Model Compression. [Unverified]
CompMarkGS: Robust Watermarking for Compressed 3D Gaussian Splatting (Mar 17, 2025). Tags: 3DGS, 3D Reconstruction. [Unverified]
ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning (Mar 17, 2025). Tags: GPU, Model Compression. [Unverified]
Fragile Mastery: Are Domain-Specific Trade-Offs Undermining On-Device Language Models? (Mar 16, 2025). Tags: Model Compression, Raspberry Pi 4. [Unverified]
Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge (Mar 12, 2025). Tags: CPU, GPU. [Unverified]
Position-Aware Depth Decay Decoding (D^3): Boosting Large Language Model Inference Efficiency (Mar 11, 2025). Tags: GSM8K, Language Modeling. [Unverified]
Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices (Mar 10, 2025). Tags: CPU, GPU. [Unverified]
Towards Superior Quantization Accuracy: A Layer-sensitive Approach (Mar 9, 2025). Tags: Logical Reasoning, Model Compression. [Unverified]
IteRABRe: Iterative Recovery-Aided Block Reduction (Mar 8, 2025). Tags: Model Compression. [Unverified]
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation (Mar 8, 2025). Tags: Autonomous Driving, Feature Selection. [Unverified]
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models (Mar 8, 2025). Tags: Edge Computing, Model Compression. [Unverified]
CASP: Compression of Large Multimodal Models Based on Attention Sparsity (Mar 7, 2025). Tags: Model Compression, Quantization. [Unverified]
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation (Mar 6, 2025). Tags: Model Compression, Transfer Learning. [Code available]
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression (Mar 6, 2025). Tags: Benchmarking, Common Sense Reasoning. [Unverified]
10K is Enough: An Ultra-Lightweight Binarized Network for Infrared Small-Target Detection (Mar 4, 2025). Tags: Binarization, Model Compression. [Code available]
Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models (Feb 27, 2025). Tags: Knowledge Distillation, Model Compression. [Unverified]
Vision Transformers on the Edge: A Comprehensive Survey of Model Compression and Acceleration Strategies (Feb 26, 2025). Tags: Image Classification. [Unverified]
AfroXLMR-Comet: Multilingual Knowledge Distillation with Attention Matching for Low-Resource Languages (Feb 25, 2025). Tags: Knowledge Distillation, Language Modeling. [Unverified]
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? (Feb 24, 2025). Tags: Arithmetic Reasoning, Common Sense Reasoning. [Unverified]
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs (Feb 23, 2025). Tags: Data Poisoning, Diagnostic. [Unverified]
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models (Feb 21, 2025). Tags: Model Compression, Quantization. [Unverified]
Optimizing Singular Spectrum for Large Language Model Compression (Feb 20, 2025). Tags: Language Modeling. [Unverified]
Efficient AI in Practice: Training and Deployment of Efficient LLMs for Industry Applications (Feb 20, 2025). Tags: Knowledge Distillation, Model Compression. [Unverified]
Vision Foundation Models in Medical Image Analysis: Advances and Challenges (Feb 20, 2025). Tags: Domain Adaptation, Federated Learning. [Unverified]
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures (Feb 19, 2025). Tags: Model Compression. [Unverified]
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models (Feb 18, 2025). Tags: Knowledge Distillation, Mixture-of-Experts. [Unverified]
OPTISHEAR: Towards Efficient and Adaptive Pruning of Large Language Models via Evolutionary Optimization (Feb 15, 2025). Tags: Model Compression. [Unverified]
Vision-Language Models for Edge Networks: A Comprehensive Survey (Feb 11, 2025). Tags: Autonomous Vehicles, Image Captioning. [Unverified]
Runtime Tunable Tsetlin Machines for Edge Inference on eFPGAs (Feb 10, 2025). Tags: Model Compression, Resynthesis. [Unverified]