ARB-LLM: Alternating Refined Binarizations for Large Language Models Oct 4, 2024 Binarization Quantization
Lightweight Diffusion Models for Resource-Constrained Semantic Communication Oct 3, 2024 Quantization Semantic Communication
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices Oct 2, 2024 GPU Language Modeling
Search for Efficient Large Language Models Sep 25, 2024 GPU Model Compression
BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices Sep 25, 2024 image-classification Image Classification
MICSim: A Modular Simulator for Mixed-signal Compute-in-Memory based AI Accelerator Sep 23, 2024 Quantization
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing Sep 12, 2024 Image Generation Quantization
BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration Sep 8, 2024 Deep Learning Quantization
Designing Large Foundation Models for Efficient Training and Inference: A Survey Sep 3, 2024 Knowledge Distillation Model Compression
VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization Sep 2, 2024 Anomaly Detection Multi-class Anomaly Detection
Hyper-Compression: Model Compression via Hyperfunction Sep 1, 2024 model Model Compression
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit Aug 26, 2024 Quantization
Quantization-aware Matrix Factorization for Low Bit Rate Image Compression Aug 22, 2024 Image Compression Quantization
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation Aug 7, 2024 GPU Quantization
EC-Guide: A Comprehensive E-Commerce Guide for Instruction Tuning and Quantization Aug 6, 2024 Quantization
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training Jul 30, 2024 GPU Knowledge Distillation
Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations Jul 19, 2024 CPU Quantization
A Benchmark for Gaussian Splatting Compression and Quality Assessment Study Jul 19, 2024 Attribute Data Compression
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer Jul 17, 2024 Instance Segmentation object-detection
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models Jul 16, 2024 Quantization
Exploring Quantization for Efficient Pre-Training of Transformer Language Models Jul 16, 2024 Language Modeling Language Modelling
PSC: Posterior Sampling-Based Compression Jul 13, 2024 Decoder Image Compression
On Exact Bit-level Reversible Transformers Without Changing Architectures Jul 12, 2024 image-classification Image Classification
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Jul 10, 2024 parameter-efficient fine-tuning Quantization
Dataset Quantization with Active Learning based Adaptive Sampling Jul 9, 2024 Active Learning Dataset Distillation
OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks Jul 7, 2024 Quantization
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs Jul 7, 2024 Contrastive Learning object-detection
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking Jul 5, 2024 Language Modelling Large Language Model
QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices Jul 2, 2024 GPU Quantization
LLMEasyQuant: Scalable Quantization for Parallel and Distributed LLM Inference Jun 28, 2024 GPU Quantization
ViT-1.58b: Mobile Vision Transformers in the 1-bit Era Jun 26, 2024 image-classification Image Classification
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging Jun 24, 2024 MMLU Model Compression
ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models Jun 24, 2024 Quantization
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models Jun 18, 2024 Binarization Quantization
QTIP: Quantization with Trellises and Incoherence Processing Jun 17, 2024 Decoder Quantization
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking Jun 17, 2024 Model Optimization Quantization
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox Jun 15, 2024 Quantization
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark Jun 12, 2024 Benchmarking Mixture-of-Experts
2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution Jun 10, 2024 Image Super-Resolution Quantization
From Analog to Digital: Multi-Order Digital Joint Coding-Modulation for Semantic Communication Jun 8, 2024 Dimensionality Reduction Quantization
QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead Jun 5, 2024 Quantization
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning Jun 5, 2024 Quantization Reinforcement Learning (RL)
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining Jun 4, 2024 Quantization Sparse Learning
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation Jun 4, 2024 Quantization Video Generation
CE-VAE: Capsule Enhanced Variational AutoEncoder for Underwater Image Enhancement Jun 3, 2024 Image Enhancement Image Generation
MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization Jun 2, 2024 Quantization
P^2-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer May 30, 2024 Quantization
4-bit Shampoo for Memory-Efficient Network Training May 28, 2024 image-classification Image Classification
Exploiting LLM Quantization May 28, 2024 Code Generation Quantization
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation May 28, 2024 Knowledge Distillation Language Modeling