Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information | May 24, 2024 | Edge-computing, Machine Translation | Unverified | 0
BiSup: Bidirectional Quantization Error Suppression for Large Language Models | May 24, 2024 | parameter-efficient fine-tuning, Quantization | Unverified | 0
OAC: Output-adaptive Calibration for Accurate Post-training Quantization | May 23, 2024 | Quantization | Unverified | 0
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models | May 23, 2024 | Natural Language Understanding, Quantization | Code Available | 2
A rescaling-invariant Lipschitz bound based on path-metrics for modern ReLU network parameterizations | May 23, 2024 | Generalization Bounds, Network Pruning | Unverified | 0
ASI++: Towards Distributionally Balanced End-to-End Generative Retrieval | May 23, 2024 | Information Retrieval, Quantization | Unverified | 0
Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs | May 23, 2024 | Quantization | Unverified | 0
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression | May 23, 2024 | Quantization | Code Available | 5
LG-VQ: Language-Guided Codebook Learning | May 23, 2024 | Image Captioning, Image Generation | Unverified | 0
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs | May 23, 2024 | Quantization | Code Available | 0
TerDiT: Ternary Diffusion Models with Transformers | May 23, 2024 | Image Generation, Quantization | Code Available | 2
Distilling Vision-Language Pretraining for Efficient Cross-Modal Retrieval | May 23, 2024 | Cross-Modal Retrieval, Quantization | Unverified | 0
MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs | May 23, 2024 | Multivariate Time Series Forecasting, Quantization | Unverified | 0
Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising | May 23, 2024 | Denoising, Image Generation | Unverified | 0
Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models | May 23, 2024 | Data Compression, Image Generation | Code Available | 1
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models | May 23, 2024 | Quantization | Unverified | 0
Embedding Compression for Efficient Re-Identification | May 23, 2024 | Dimensionality Reduction, Quantization | Unverified | 0
ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification | May 23, 2024 | GPU, GSM8K | Code Available | 1
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs | May 22, 2024 | Privacy Preserving, Quantization | Unverified | 0
Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing | May 22, 2024 | Intelligent Communication, object-detection | Unverified | 0
Communication-Efficient Federated Learning via Clipped Uniform Quantization | May 22, 2024 | Federated Learning, Quantization | Code Available | 0
QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input | May 22, 2024 | Gait Recognition, Quantization | Unverified | 0
Discrete Cosine Transform Based Decorrelated Attention for Vision Transformers | May 22, 2024 | Quantization | Unverified | 0
eXmY: A Data Type and Technique for Arbitrary Bit Precision Quantization | May 22, 2024 | CPU, Quantization | Unverified | 0
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation | May 22, 2024 | CPU, object-detection | Unverified | 0
ReALLM: A general framework for LLM compression and fine-tuning | May 21, 2024 | Decoder, Quantization | Unverified | 0
Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities | May 21, 2024 | Data Poisoning, Intrusion Detection | Unverified | 0
Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks | May 21, 2024 | Quantization | Code Available | 1
On Image Registration and Subpixel Estimation | May 21, 2024 | Image Registration, Quantization | Unverified | 0
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression | May 21, 2024 | Quantization, Tensor Decomposition | Code Available | 0
RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search | May 21, 2024 | Quantization | Code Available | 2
Online Signature Recognition: A Biologically Inspired Feature Vector Splitting Approach | May 21, 2024 | Dynamic Time Warping, Quantization | Unverified | 0
TinyM^2Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment | May 20, 2024 | Knowledge Distillation, Model Compression | Unverified | 0
Imp: Highly Capable Large Multimodal Models for Mobile Devices | May 20, 2024 | Quantization, Visual Question Answering | Code Available | 2
Flattened one-bit stochastic gradient descent: compressed distributed optimization with controlled variance | May 17, 2024 | Distributed Optimization, Quantization | Unverified | 0
Universal Joint Source-Channel Coding for Modulation-Agnostic Semantic Communication | May 17, 2024 | Decoder, Quantization | Unverified | 0
Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network | May 17, 2024 | Image Compression, Quantization | Unverified | 0
The Effect of Quantization in Federated Learning: A Rényi Differential Privacy Perspective | May 16, 2024 | Federated Learning, Privacy Preserving | Unverified | 0
Deep Learning-Enabled One-Bit DoA Estimation | May 15, 2024 | compressed sensing, Deep Learning | Code Available | 1
Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy | May 15, 2024 | Federated Learning, image-classification | Code Available | 1
Properties that allow or prohibit transferability of adversarial attacks among quantized networks | May 15, 2024 | Quantization | Code Available | 0
Neural Speech Coding for Real-time Communications using Constant Bitrate Scalar Quantization | May 14, 2024 | Quantization, Scheduling | Unverified | 0
FDD Massive MIMO: How to Optimally Combine UL Pilot and Limited DL CSI Feedback? | May 14, 2024 | Quantization | Unverified | 0
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling | May 13, 2024 | Quantization | Unverified | 0
Goal-oriented compression for L_p-norm-type goal functions: Application to power consumption scheduling | May 13, 2024 | Data Compression, Quantization | Unverified | 0
Post Training Quantization of Large Language Models with Microscaling Formats | May 12, 2024 | Language Modeling, Language Modelling | Unverified | 0
Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization | May 12, 2024 | Language Modeling, Language Modelling | Unverified | 0
Compression-Realized Deep Structural Network for Video Quality Enhancement | May 10, 2024 | Denoising, Motion Estimation | Unverified | 0
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection | May 10, 2024 | Autonomous Driving, GPU | Unverified | 0
Characterizing the Accuracy -- Efficiency Trade-off of Low-rank Decomposition in Language Models | May 10, 2024 | AI Agent, Model Compression | Unverified | 0