A Survey on Model Compression for Large Language Models Aug 15, 2023 Benchmarking Knowledge Distillation
A Survey on Transformer Compression Feb 5, 2024 Knowledge Distillation Mamba
AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations Oct 17, 2024 Decoder Quantization
Asymmetric Correlation Quantization Hashing for Cross-modal Retrieval Jan 14, 2020 Cross-Modal Retrieval Quantization
Asymmetric Deep Semantic Quantization for Image Retrieval Mar 29, 2019 Image Retrieval Quantization
Asymmetric Learned Image Compression with Multi-Scale Residual Block, Importance Map, and Post-Quantization Filtering Jun 21, 2022 Decoder Image Compression
Asymmetric Learning Vector Quantization for Efficient Nearest Neighbor Classification in Dynamic Time Warping Spaces Mar 24, 2017 Classification Dynamic Time Warping
Asymptotically Optimal Closed-Form Phase Configuration of 1-bit RISs via Sign Alignment Jul 18, 2024 Form Quantization
Asymptotic Analysis of One-bit Quantized Box-Constrained Precoding in Large-Scale Multi-User Systems Feb 5, 2025 Quantization
Asymptotic Performance Analysis of Large-Scale Active IRS-Aided Wireless Network May 31, 2023 Quantization
Asymptotic stabilization under homomorphic encryption: A re-encryption free method Apr 12, 2025 Quantization
Asymptotic tracking control of dynamic reference over homomorphically encrypted data with finite modulus Sep 27, 2024 Quantization
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization Jun 12, 2024 Computational Efficiency Pose Estimation
Asynchronous Federated Learning with Bidirectional Quantized Communications and Buffered Aggregation Aug 1, 2023 Federated Learning Quantization
A System-Level Solution for Low-Power Object Detection Sep 24, 2019 CPU Object
A Targeted Acceleration and Compression Framework for Low bit Neural Networks Jul 9, 2019 Binarization Computational Efficiency
ATHEENA: A Toolflow for Hardware Early-Exit Network Automation Apr 17, 2023 Quantization
Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information May 24, 2024 Edge-computing Machine Translation
A Tiny CNN Architecture for Medical Face Mask Detection for Resource-Constrained Endpoints Nov 30, 2020 Quantization
A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays Oct 20, 2021 Continual Learning Quantization
Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures Jan 16, 2025 Model Compression Quantization
Atomic Anatomy of Low-Inertia Power Systems May 21, 2023 Anatomy Quantization
Atrous Space Bender U-Net (ASBU-Net/LogiNet) Dec 16, 2022 Quantization Segmentation
Attacking Binarized Neural Networks Nov 1, 2017 Quantization
Attention Augmented Convolutional Transformer for Tabular Time-series Oct 5, 2021 Language Modeling Language Modelling
Attention-aware Post-training Quantization without Backpropagation Jun 19, 2024 Quantization
Attention based on-device streaming speech recognition with large speech corpus Jan 2, 2020 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Attention-based Transducer for Online Speech Recognition May 18, 2020 CPU Decoder
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency Nov 5, 2023 Quantization
Attention Round for Post-Training Quantization Jul 7, 2022 Combinatorial Optimization Quantization
Attentive One-Dimensional Heatmap Regression for Facial Landmark Detection and Tracking Apr 5, 2020 Face Alignment Facial Landmark Detection
Attribute Artifacts Removal for Geometry-based Point Cloud Compression Dec 1, 2021 Attribute Graph Attention
Augmented Deep Unfolding for Downlink Beamforming in Multi-cell Massive MIMO With Limited Feedback Sep 3, 2022 Quantization
Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML May 23, 2023 Bayesian Optimization Hyperparameter Optimization
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization Jun 8, 2023 Quantization
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation Apr 2, 2023 Face Generation Knowledge Distillation
A Unified Framework of DNN Weight Pruning and Weight Clustering/Quantization Using ADMM Nov 5, 2018 Clustering Model Compression
A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent May 27, 2019 Quantization
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models Oct 17, 2024 Quantization
AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks Jul 8, 2020 image-classification Image Classification
Autoencoder-Based Error Correction Coding for One-Bit Quantization Sep 24, 2019 Quantization
Autoencoder based image compression: can the learning be quantization independent? Feb 23, 2018 Image Compression Quantization
Automated Backend-Aware Post-Training Quantization Mar 27, 2021 CPU Diversity
Automated design of error-resilient and hardware-efficient deep neural networks Sep 30, 2019 Autonomous Vehicles Quantization
Automated flow for compressing convolution neural networks for efficient edge-computation with FPGA Dec 18, 2017 CPU object-detection
Automated Heterogeneous Low-Bit Quantization of Multi-Model Deep Learning Inference Pipeline Nov 10, 2023 Ensemble Learning Multi-Task Learning
Automated Linear-Time Detection and Quality Assessment of Superpixels in Uncalibrated True- or False-Color RGB Images Jan 8, 2017 Color Constancy Computational Efficiency
Automated Log-Scale Quantization for Low-Cost Deep Neural Networks Jun 19, 2021 Image Enhancement Quantization
Automated Model Compression by Jointly Applied Pruning and Quantization Nov 12, 2020 AutoML Model Compression
Automated Tomato Maturity Estimation Using an Optimized Residual Model with Pruning and Quantization Techniques Mar 13, 2025 Classification Computational Efficiency