Title | Date | Tasks | Code
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity | Jun 5, 2024 | GPU, Quantization | Unverified
Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation | Jun 4, 2024 | Computational Efficiency, Edge-computing | Unverified
Toward Efficient Deep Spiking Neuron Networks: A Survey On Compression | Jun 3, 2024 | Knowledge Distillation, Quantization | Unverified
Log-Scale Quantization in Distributed First-Order Methods: Gradient-based Learning from Distributed Data | Jun 2, 2024 | Distributed Optimization, Quantization | Unverified
Privacy-Aware Randomized Quantization via Linear Programming | Jun 1, 2024 | Quantization | Code Available
LCQ: Low-Rank Codebook based Quantization for Large Language Models | May 31, 2024 | Model Compression, Quantization | Unverified
Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs | May 31, 2024 | Quantization | Unverified
Effective Interplay between Sparsity and Quantization: From Theory to Practice | May 31, 2024 | Computational Efficiency, Model Compression | Unverified
Locking Machine Learning Models into Hardware | May 31, 2024 | Quantization | Unverified
HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization | May 30, 2024 | Quantization | Unverified
An Efficient Network with Novel Quantization Designed for Massive MIMO CSI Feedback | May 30, 2024 | Quantization | Unverified
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments | May 30, 2024 | All, Quantization | Unverified
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs | May 30, 2024 | GPU, Quantization | Unverified
Information Entropy Guided Height-aware Histogram for Quantization-friendly Pillar Feature Encoder | May 29, 2024 | 3D Object Detection, Autonomous Driving | Unverified
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models | May 28, 2024 | Neural Architecture Search, Quantization | Unverified
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization | May 28, 2024 | Denoising, Quantization | Unverified
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models | May 28, 2024 | Quantization | Unverified
The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention | May 28, 2024 | object-detection, Object Detection | Unverified
Di^2Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation | May 27, 2024 | 3D Human Pose Estimation, Monocular 3D Human Pose Estimation | Unverified
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs | May 27, 2024 | Computational Efficiency, Quantization | Code Available
UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation | May 27, 2024 | Image Compression, Knowledge Distillation | Unverified
BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics | May 27, 2024 | Decoder, Quantization | Unverified
FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference | May 25, 2024 | Quantization | Unverified
Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information | May 24, 2024 | Edge-computing, Machine Translation | Unverified
BiSup: Bidirectional Quantization Error Suppression for Large Language Models | May 24, 2024 | parameter-efficient fine-tuning, Quantization | Unverified
Massive MIMO-ISAC System With 1-Bit ADCs/DACs | May 24, 2024 | Integrated sensing and communication, ISAC | Unverified
MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs | May 23, 2024 | Multivariate Time Series Forecasting, Quantization | Unverified
Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs | May 23, 2024 | Quantization | Unverified
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs | May 23, 2024 | Quantization | Code Available
ASI++: Towards Distributionally Balanced End-to-End Generative Retrieval | May 23, 2024 | Information Retrieval, Quantization | Unverified
Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising | May 23, 2024 | Denoising, Image Generation | Unverified
OAC: Output-adaptive Calibration for Accurate Post-training Quantization | May 23, 2024 | Quantization | Unverified
A rescaling-invariant Lipschitz bound based on path-metrics for modern ReLU network parameterizations | May 23, 2024 | Generalization Bounds, Network Pruning | Unverified
Embedding Compression for Efficient Re-Identification | May 23, 2024 | Dimensionality Reduction, Quantization | Unverified
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models | May 23, 2024 | Quantization | Unverified
Distilling Vision-Language Pretraining for Efficient Cross-Modal Retrieval | May 23, 2024 | Cross-Modal Retrieval, Quantization | Unverified
LG-VQ: Language-Guided Codebook Learning | May 23, 2024 | Image Captioning, Image Generation | Unverified
AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs | May 22, 2024 | Privacy Preserving, Quantization | Unverified
eXmY: A Data Type and Technique for Arbitrary Bit Precision Quantization | May 22, 2024 | CPU, Quantization | Unverified
Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing | May 22, 2024 | Intelligent Communication, object-detection | Unverified
QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input | May 22, 2024 | Gait Recognition, Quantization | Unverified
Communication-Efficient Federated Learning via Clipped Uniform Quantization | May 22, 2024 | Federated Learning, Quantization | Code Available
Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation | May 22, 2024 | CPU, object-detection | Unverified
Discrete Cosine Transform Based Decorrelated Attention for Vision Transformers | May 22, 2024 | Quantization | Unverified
Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities | May 21, 2024 | Data Poisoning, Intrusion Detection | Unverified
ReALLM: A general framework for LLM compression and fine-tuning | May 21, 2024 | Decoder, Quantization | Unverified
On Image Registration and Subpixel Estimation | May 21, 2024 | Image Registration, Quantization | Unverified
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression | May 21, 2024 | Quantization, Tensor Decomposition | Code Available
Online Signature Recognition: A Biologically Inspired Feature Vector Splitting Approach | May 21, 2024 | Dynamic Time Warping, Quantization | Unverified
TinyM^2Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment | May 20, 2024 | Knowledge Distillation, Model Compression | Unverified