ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals | Dec 18, 2024 | Quantization | Code Available | 1
A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3
Autoregressive Video Generation without Vector Quantization | Dec 18, 2024 | Image Generation, Prediction | Code Available | 4
Self-control: A Better Conditional Mechanism for Masked Autoregressive Model | Dec 18, 2024 | Conditional Image Generation, Image Generation | Unverified | 0
On the Compression of Language Models for Code: An Empirical Study on CodeBERT | Dec 18, 2024 | Code Search, Code Summarization | Unverified | 0
More Tokens, Lower Precision: Towards the Optimal Token-Precision Trade-off in KV Cache Compression | Dec 17, 2024 | Quantization | Unverified | 0
VidTok: A Versatile and Open-Source Video Tokenizer | Dec 17, 2024 | Quantization, SSIM | Code Available | 3
Apollo-Forecast: Overcoming Aliasing and Inference Speed Challenges in Language Models for Time Series Forecasting | Dec 16, 2024 | Quantization, Time Series | Unverified | 0
Fast and Slow Gradient Approximation for Binary Neural Network Optimization | Dec 16, 2024 | Quantization | Code Available | 0
Quantifying Climate Change Impacts on Renewable Energy Generation: A Super-Resolution Recurrent Diffusion Model | Dec 16, 2024 | Denoising, Quantization | Unverified | 0
QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models | Dec 16, 2024 | Bayesian Optimization, Quantization | Unverified | 0
FinLoRA: Finetuning Quantized Financial Large Language Models Using Low-Rank Adaptation | Dec 16, 2024 | GPU, Information Retrieval | Unverified | 0
CSR: Achieving 1 Bit Key-Value Cache via Sparse Representation | Dec 16, 2024 | Quantization | Unverified | 0
Relation-Guided Adversarial Learning for Data-free Knowledge Transfer | Dec 16, 2024 | Data-free Knowledge Distillation, Data Free Quantization | Code Available | 1
MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models | Dec 16, 2024 | Quantization | Code Available | 1
VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression | Dec 16, 2024 | NeRF, Quantization | Unverified | 0
Nanoscaling Floating-Point (NxFP): NanoMantissa, Adaptive Microexponents, and Code Recycling for Direct-Cast Compression of Large Language Models | Dec 15, 2024 | MMLU, Quantization | Unverified | 0
ProFe: Communication-Efficient Decentralized Federated Learning via Distillation and Prototypes | Dec 15, 2024 | Federated Learning, Knowledge Distillation | Unverified | 0
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs | Dec 15, 2024 | Model Compression, Quantization | Unverified | 0
Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment | Dec 15, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 0
Enhancing Off-Grid One-Bit DOA Estimation with Learning-Based Sparse Bayesian Approach for Non-Uniform Sparse Array | Dec 14, 2024 | Computational Efficiency, Quantization | Unverified | 0
Adaptive Quantization Resolution and Power Control for Federated Learning over Cell-free Networks | Dec 14, 2024 | Federated Learning, Quantization | Unverified | 0
TinySubNets: An efficient and low capacity continual learning strategy | Dec 14, 2024 | Continual Learning, Quantization | Code Available | 0
Memory-Efficient 4-bit Preconditioned Stochastic Optimization | Dec 14, 2024 | Quantization, Stochastic Optimization | Unverified | 0
Progressive Compression with Universally Quantized Diffusion Models | Dec 14, 2024 | Image Compression, Image Generation | Unverified | 0
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | Dec 13, 2024 | Conditional Image Generation, Image Generation | Unverified | 0
VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization | Dec 13, 2024 | Face Generation, Motion Generation | Unverified | 0
TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation | Dec 13, 2024 | Domain Adaptation, Quantization | Unverified | 0
MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization | Dec 13, 2024 | Image Classification | Unverified | 0
SCBench: A KV Cache-Centric Analysis of Long-Context Methods | Dec 13, 2024 | Mamba, Quantization | Code Available | 5
Panacea: Novel DNN Accelerator using Accuracy-Preserving Asymmetric Quantization and Energy-Saving Bit-Slice Sparsity | Dec 13, 2024 | Quantization | Unverified | 0
CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | Dec 13, 2024 | In-Context Learning, Quantization | Code Available | 11
DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations | Dec 12, 2024 | Image Classification | Unverified | 0
Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Dec 12, 2024 | 4k, GSM8K | Code Available | 1
CRVQ: Channel-relaxed Vector Quantization for Extreme Compression of LLMs | Dec 12, 2024 | Quantization | Unverified | 0
Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on Edge Devices | Dec 12, 2024 | Knowledge Distillation, Mamba | Unverified | 0
On Round-Off Errors and Gaussian Blur in Superresolution and in Image Registration | Dec 12, 2024 | Image Registration, Quantization | Unverified | 0
Breaking the Bias: Recalibrating the Attention of Industrial Anomaly Detection | Dec 11, 2024 | Anomaly Detection, Computational Efficiency | Unverified | 0
TurboAttention: Efficient Attention Approximation For High Throughputs LLMs | Dec 11, 2024 | Computational Efficiency, Language Modeling | Unverified | 0
Low-Rank Correction for Quantized LLMs | Dec 10, 2024 | Model Compression, Quantization | Unverified | 0
Machine learning-driven conservative-to-primitive conversion in hybrid piecewise polytropic and tabulated equations of state | Dec 10, 2024 | CPU, GPU | Unverified | 0
Post-Training Non-Uniform Quantization for Convolutional Neural Networks | Dec 10, 2024 | Image Classification | Unverified | 0
QuantFormer: Learning to Quantize for Neural Activity Forecasting in Mouse Visual Cortex | Dec 10, 2024 | Quantization | Unverified | 0
FP=xINT: A Low-Bit Series Expansion Algorithm for Post-Training Quantization | Dec 9, 2024 | Quantization | Unverified | 0
Compression for Better: A General and Stable Lossless Compression Framework | Dec 9, 2024 | Computational Efficiency, Model Compression | Unverified | 0
Federated Split Learning with Model Pruning and Gradient Quantization in Wireless Networks | Dec 9, 2024 | Federated Learning, Quantization | Unverified | 0
Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion | Dec 9, 2024 | Denoising, Image Generation | Unverified | 0
Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization | Dec 8, 2024 | Quantization | Unverified | 0
Fuzzy Norm-Explicit Product Quantization for Recommender Systems | Dec 8, 2024 | Quantization, Recommendation Systems | Unverified | 0
SizeGS: Size-aware Compression of 3D Gaussians with Hierarchical Mixed Precision Quantization | Dec 8, 2024 | 3DGS, Attribute | Unverified | 0