A Comprehensive Evaluation of Quantization Strategies for Large Language Models | Feb 26, 2024 | Language Modeling, Language Modelling | Code Available | 0
Data-free Weight Compress and Denoise for Large Language Models | Feb 26, 2024 | GPU, Quantization | Unverified | 0
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech | Feb 26, 2024 | Quantization, Speech Enhancement | Code Available | 2
LLM Inference Unveiled: Survey and Roofline Model Insights | Feb 26, 2024 | Knowledge Distillation, Language Modelling | Code Available | 4
EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration | Feb 25, 2024 | Efficient Neural Network, image-classification | Code Available | 0
Towards Accurate Post-training Quantization for Reparameterized Models | Feb 25, 2024 | Quantization | Code Available | 0
GPTVQ: The Blessing of Dimensionality for LLM Quantization | Feb 23, 2024 | CPU, Quantization | Unverified | 0
On the Arrow of Inference | Feb 22, 2024 | counterfactual, Counterfactual Reasoning | Unverified | 0
Text me the data: Generating Ground Pressure Sequence from Textual Descriptions for HAR | Feb 22, 2024 | Activity Recognition, Human Activity Recognition | Unverified | 0
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models | Feb 21, 2024 | Quantization | Unverified | 0
Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation | Feb 21, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 1
FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing | Feb 21, 2024 | GPU, Model Compression | Unverified | 0
In-Distribution Consistency Regularization Improves the Generalization of Quantization-Aware Training | Feb 21, 2024 | Knowledge Distillation, Quantization | Unverified | 0
Understanding and Mitigating the Threat of Vec2Text to Dense Retrieval Systems | Feb 20, 2024 | Quantization, Retrieval | Code Available | 1
Tiny Reinforcement Learning for Quadruped Locomotion using Decision Transformers | Feb 20, 2024 | Imitation Learning, Quantization | Code Available | 0
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models | Feb 19, 2024 | Audio Compression, Audio Generation | Code Available | 3
Towards a tailored mixed-precision sub-8-bit quantization scheme for Gated Recurrent Units using Genetic Algorithms | Feb 19, 2024 | Model Compression, Quantization | Unverified | 0
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More | Feb 19, 2024 | Quantization, Text Generation | Unverified | 0
Is It a Free Lunch for Removing Outliers during Pretraining? | Feb 19, 2024 | Quantization | Unverified | 0
DB-LLM: Accurate Dual-Binarization for Efficient LLMs | Feb 19, 2024 | Binarization, Computational Efficiency | Unverified | 0
LaCo: Large Language Model Pruning via Layer Collapse | Feb 17, 2024 | Knowledge Distillation, Language Modeling | Code Available | 1
Hierarchical Prior-based Super Resolution for Point Cloud Geometry Compression | Feb 17, 2024 | Decoder, Quantization | Code Available | 1
OneBit: Towards Extremely Low-bit Large Language Models | Feb 17, 2024 | Quantization | Code Available | 3
One-Bit Quantization and Sparsification for Multiclass Linear Classification with Strong Regularization | Feb 16, 2024 | Classification, Quantization | Unverified | 0
QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning | Feb 16, 2024 | GPU, Language Modeling | Unverified | 0
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs | Feb 16, 2024 | Quantization | Code Available | 2
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge | Feb 16, 2024 | Quantization | Code Available | 1
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation | Feb 16, 2024 | Knowledge Distillation, Quantization | Code Available | 4
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Feb 16, 2024 | continuous-control, Continuous Control | Code Available | 1
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains | Feb 15, 2024 | Few-Shot Learning, Medical Question Answering | Code Available | 2
Multi-Excitation Projective Simulation with a Many-Body Physics Inspired Inductive Bias | Feb 15, 2024 | Explainable artificial intelligence, Explainable Artificial Intelligence (XAI) | Code Available | 0
Model Compression and Efficient Inference for Large Language Models: A Survey | Feb 15, 2024 | Knowledge Distillation, Model Compression | Unverified | 0
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference | Feb 15, 2024 | GPU, Quantization | Code Available | 2
Quantized Embedding Vectors for Controllable Diffusion Language Models | Feb 15, 2024 | Language Modeling, Language Modelling | Unverified | 0
Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems | Feb 14, 2024 | Quantization | Code Available | 0
Rate-Splitting Multiple Access for Quantized ISAC LEO Satellite Systems: A Max-Min Fair Energy-Efficient Beam Design | Feb 14, 2024 | Fairness, ISAC | Unverified | 0
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers | Feb 14, 2024 | Quantization | Unverified | 0
BdSLW60: A Word-Level Bangla Sign Language Dataset | Feb 13, 2024 | Benchmarking, Gesture Recognition | Code Available | 0
TeMPO: Efficient Time-Multiplexed Dynamic Photonic Tensor Core for Edge AI with Compact Slow-Light Electro-Optic Modulator | Feb 12, 2024 | Quantization | Unverified | 0
Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks | Feb 11, 2024 | Quantization | Unverified | 0
On Leaky-Integrate-and Fire as Spike-Train-Quantization Operator on Dirac-Superimposed Continuous-Time Signals | Feb 10, 2024 | Quantization | Unverified | 0
A Thorough Examination of Decoding Methods in the Era of LLMs | Feb 10, 2024 | Quantization | Code Available | 1
LiRank: Industrial Large Scale Ranking Models at LinkedIn | Feb 10, 2024 | Click-Through Rate Prediction, Quantization | Unverified | 0
RQP-SGD: Differential Private Machine Learning through Noisy SGD and Randomized Quantization | Feb 9, 2024 | Privacy Preserving, Quantization | Unverified | 0
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings | Feb 9, 2024 | Machine Translation, Quantization | Code Available | 1
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention | Feb 8, 2024 | MMLU, Quantization | Code Available | 2
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization | Feb 8, 2024 | Quantization | Unverified | 0
Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting | Feb 8, 2024 | Computational Efficiency, Multivariate Time Series Forecasting | Unverified | 0
ApiQ: Finetuning of 2-Bit Quantized Large Language Model | Feb 7, 2024 | GPU, Language Modeling | Code Available | 1
Majority Kernels: An Approach to Leverage Big Model Dynamics for Efficient Small Model Training | Feb 7, 2024 | Combinatorial Optimization, Computational Efficiency | Unverified | 0