Sliding DFT-based Signal Recovery for Modulo ADC with 1-bit Folding Information | Oct 24, 2024 | Quantization | Unverified | 0
A Counterexample in Cross-Correlation Template Matching | Oct 24, 2024 | Image Registration, Quantization | Unverified | 0
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search | Oct 24, 2024 | Clustering, GPU | Code Available | 2
The Nature of Mathematical Modeling and Probabilistic Optimization Engineering in Generative AI | Oct 24, 2024 | Quantization | Unverified | 0
Adaptive Wireless Image Semantic Transmission: Design, Simulation, and Prototype Validation | Oct 23, 2024 | Image Reconstruction, Quantization | Unverified | 0
Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation ? | Oct 22, 2024 | Machine Translation, Quantization | Unverified | 0
Self-calibration for Language Model Quantization and Pruning | Oct 22, 2024 | Language Modeling, Language Modelling | Unverified | 0
Pyramid Vector Quantization for LLMs | Oct 22, 2024 | Quantization | Unverified | 0
Catastrophic Failure of LLM Unlearning via Quantization | Oct 21, 2024 | Machine Unlearning, Quantization | Code Available | 1
Continuous Speech Synthesis using per-token Latent Diffusion | Oct 21, 2024 | Image Generation, Quantization | Unverified | 0
Residual vector quantization for KV cache compression in large language model | Oct 21, 2024 | Audio Compression, Language Modeling | Code Available | 1
Large Deviation Upper Bounds and Improved MSE Rates of Nonlinear SGD: Heavy-tailed Noise and Power of Symmetry | Oct 21, 2024 | Quantization | Unverified | 0
Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces | Oct 21, 2024 | Continual Learning, Lifelong learning | Unverified | 0
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec | Oct 21, 2024 | Disentanglement, Language Modeling | Unverified | 0
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training | Oct 20, 2024 | Quantization | Unverified | 0
Lossless KV Cache Compression to 2% | Oct 20, 2024 | Dimensionality Reduction, Quantization | Unverified | 0
Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks | Oct 18, 2024 | Code Generation, GPU | Code Available | 0
SNAC: Multi-Scale Neural Audio Codec | Oct 18, 2024 | Audio Compression, Audio Generation | Code Available | 4
Understanding the Difficulty of Low-Precision Post-Training Quantization for LLMs | Oct 18, 2024 | Quantization | Unverified | 0
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Oct 18, 2024 | Model Compression, Quantization | Code Available | 1
Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching | Oct 17, 2024 | GPU, Quantization | Unverified | 0
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees | Oct 17, 2024 | Quantization | Unverified | 0
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech | Oct 17, 2024 | Disentanglement, Quantization | Unverified | 0
AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations | Oct 17, 2024 | Decoder, Quantization | Unverified | 0
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction | Oct 17, 2024 | Quantization | Code Available | 2
Quamba: A Post-Training Quantization Recipe for Selective State Space Models | Oct 17, 2024 | Computational Efficiency, Mamba | Code Available | 2
Optimal Quantization for Matrix Multiplication | Oct 17, 2024 | Quantization | Code Available | 0
Progressive Mixed-Precision Decoding for Efficient LLM Inference | Oct 17, 2024 | Quantization | Unverified | 0
DPLM-2: A Multimodal Diffusion Protein Language Model | Oct 17, 2024 | Language Modeling, Language Modelling | Code Available | 3
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs | Oct 17, 2024 | Quantization | Code Available | 1
Learning Graph Quantized Tokenizers | Oct 17, 2024 | Graph Learning, Quantization | Code Available | 1
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models | Oct 17, 2024 | Quantization | Unverified | 0
ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs | Oct 16, 2024 | Diversity, Online Clustering | Unverified | 0
Channel-Wise Mixed-Precision Quantization for Large Language Models | Oct 16, 2024 | Quantization | Unverified | 0
COMET: Towards Partical W4A4KV4 LLMs Serving | Oct 16, 2024 | Quantization, Scheduling | Unverified | 0
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs | Oct 16, 2024 | Quantization | Code Available | 0
FairGLVQ: Fairness in Partition-Based Classification | Oct 16, 2024 | Classification, Fairness | Code Available | 0
Scaling Laws for Post Training Quantized Large Language Models | Oct 15, 2024 | Quantization | Unverified | 0
Efficiera Residual Networks: Hardware-Friendly Fully Binary Weight with 2-bit Activation Model Achieves Practical ImageNet Accuracy | Oct 15, 2024 | Binarization, Classification with Binary Weight Network | Code Available | 0
Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks | Oct 15, 2024 | Quantization | Code Available | 1
Latent Action Pretraining from Videos | Oct 15, 2024 | Quantization, Robot Manipulation | Code Available | 3
QSpec: Speculative Decoding with Complementary Quantization Schemes | Oct 15, 2024 | Quantization | Unverified | 0
Real-Time Stress Detection via Photoplethysmogram Signals: Implementation of a Combined Continuous Wavelet Transform and Convolutional Neural Network on Resource-Constrained Microcontrollers | Oct 14, 2024 | Quantization | Unverified | 0
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads | Oct 14, 2024 | GPU, Quantization | Code Available | 4
SLaNC: Static LayerNorm Calibration | Oct 14, 2024 | Quantization | Unverified | 0
Gaussian Mixture Vector Quantization with Aggregated Categorical Posterior | Oct 14, 2024 | Quantization | Unverified | 0
When Attention Sink Emerges in Language Models: An Empirical View | Oct 14, 2024 | Quantization | Code Available | 2
Gradient-Free Neural Network Training on the Edge | Oct 13, 2024 | Quantization | Unverified | 0
GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation | Oct 13, 2024 | 3D Generation, Quantization | Unverified | 0
FlatQuant: Flatness Matters for LLM Quantization | Oct 12, 2024 | Quantization | Code Available | 3