| Title | Date | Tasks | Code |
| --- | --- | --- | --- |
| Pyramid Vector Quantization for LLMs | Oct 22, 2024 | Quantization | Unverified |
| Self-calibration for Language Model Quantization and Pruning | Oct 22, 2024 | Language Modeling, Language Modelling | Unverified |
| Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation? | Oct 22, 2024 | Machine Translation, Quantization | Unverified |
| Continuous Speech Synthesis using per-token Latent Diffusion | Oct 21, 2024 | Image Generation, Quantization | Unverified |
| Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces | Oct 21, 2024 | Continual Learning, Lifelong learning | Unverified |
| LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec | Oct 21, 2024 | Disentanglement, Language Modeling | Unverified |
| Large Deviation Upper Bounds and Improved MSE Rates of Nonlinear SGD: Heavy-tailed Noise and Power of Symmetry | Oct 21, 2024 | Quantization | Unverified |
| SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training | Oct 20, 2024 | Quantization | Unverified |
| Lossless KV Cache Compression to 2% | Oct 20, 2024 | Dimensionality Reduction, Quantization | Unverified |
| Understanding the Difficulty of Low-Precision Post-Training Quantization for LLMs | Oct 18, 2024 | Quantization | Unverified |
| Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks | Oct 18, 2024 | Code Generation, GPU | Code Available |
| AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations | Oct 17, 2024 | Decoder, Quantization | Unverified |
| Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees | Oct 17, 2024 | Quantization | Unverified |
| Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching | Oct 17, 2024 | GPU, Quantization | Unverified |
| Progressive Mixed-Precision Decoding for Efficient LLM Inference | Oct 17, 2024 | Quantization | Unverified |
| A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models | Oct 17, 2024 | Quantization | Unverified |
| Optimal Quantization for Matrix Multiplication | Oct 17, 2024 | Quantization | Code Available |
| DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech | Oct 17, 2024 | Disentanglement, Quantization | Unverified |
| COMET: Towards Partical W4A4KV4 LLMs Serving | Oct 16, 2024 | Quantization, Scheduling | Unverified |
| ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs | Oct 16, 2024 | Diversity, Online Clustering | Unverified |
| Channel-Wise Mixed-Precision Quantization for Large Language Models | Oct 16, 2024 | Quantization | Unverified |
| FairGLVQ: Fairness in Partition-Based Classification | Oct 16, 2024 | Classification, Fairness | Code Available |
| DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs | Oct 16, 2024 | Quantization | Code Available |
| QSpec: Speculative Decoding with Complementary Quantization Schemes | Oct 15, 2024 | Quantization | Unverified |
| Efficiera Residual Networks: Hardware-Friendly Fully Binary Weight with 2-bit Activation Model Achieves Practical ImageNet Accuracy | Oct 15, 2024 | Binarization, Classification with Binary Weight Network | Code Available |
| Scaling Laws for Post Training Quantized Large Language Models | Oct 15, 2024 | Quantization | Unverified |
| Real-Time Stress Detection via Photoplethysmogram Signals: Implementation of a Combined Continuous Wavelet Transform and Convolutional Neural Network on Resource-Constrained Microcontrollers | Oct 14, 2024 | Quantization | Unverified |
| SLaNC: Static LayerNorm Calibration | Oct 14, 2024 | Quantization | Unverified |
| Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior | Oct 14, 2024 | Quantization | Unverified |
| GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation | Oct 13, 2024 | 3D Generation, Quantization | Unverified |
| Gradient-Free Neural Network Training on the Edge | Oct 13, 2024 | Quantization | Unverified |
| PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization | Oct 12, 2024 | Quantization | Unverified |
| ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification | Oct 11, 2024 | MME, Quantization | Unverified |
| QEFT: Quantization for Efficient Fine-Tuning of LLMs | Oct 11, 2024 | parameter-efficient fine-tuning, Quantization | Code Available |
| DeltaDQ: Ultra-High Delta Compression for Fine-Tuned LLMs via Group-wise Dropout and Separate Quantization | Oct 11, 2024 | Diversity, Quantization | Unverified |
| Scalable Representation Learning for Multimodal Tabular Transactions | Oct 10, 2024 | Decoder, Quantization | Unverified |
| ACCEPT: Adaptive Codebook for Composite and Efficient Prompt Tuning | Oct 10, 2024 | Natural Language Understanding, parameter-efficient fine-tuning | Code Available |
| M^2-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization | Oct 10, 2024 | Efficient ViTs, Quantization | Unverified |
| MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion | Oct 10, 2024 | Denoising, parameter-efficient fine-tuning | Code Available |
| DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Oct 10, 2024 | Denoising, Image Generation | Unverified |
| CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression | Oct 10, 2024 | Language Modeling, Language Modelling | Unverified |
| Perceptual Quality Assessment of Trisoup-Lifting Encoded 3D Point Clouds | Oct 9, 2024 | Point Cloud Quality Assessment, Quantization | Code Available |
| QuAILoRA: Quantization-Aware Initialization for LoRA | Oct 9, 2024 | Causal Language Modeling, GPU | Unverified |
| Scaling Laws for Mixed quantization in Large Language Models | Oct 9, 2024 | Quantization | Unverified |
| JPEG Inspired Deep Learning | Oct 9, 2024 | Deep Learning, Fine-Grained Image Classification | Code Available |
| Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression | Oct 8, 2024 | Quantization, regression | Unverified |
| Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-training | Oct 8, 2024 | Decoder, Quantization | Unverified |
| Accelerating Error Correction Code Transformers | Oct 8, 2024 | Quantization | Code Available |
| QERA: an Analytical Framework for Quantization Error Reconstruction | Oct 8, 2024 | parameter-efficient fine-tuning, Quantization | Unverified |
| Variable Bitrate Residual Vector Quantization for Audio Coding | Oct 8, 2024 | Audio Compression, Quantization | Unverified |