Majority Kernels: An Approach to Leverage Big Model Dynamics for Efficient Small Model Training | Feb 7, 2024 | Combinatorial Optimization, Computational Efficiency | Unverified 0
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap | Feb 6, 2024 | Domain Generalization, Quantization | Code Available 0
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks | Feb 6, 2024 | Quantization | Code Available 4
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Feb 6, 2024 | Binarization, GPU | Code Available 3
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Feb 6, 2024 | Image Generation, Model Compression | Code Available 2
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes | Feb 6, 2024 | Federated Learning, Model Compression | Unverified 0
A Survey on Transformer Compression | Feb 5, 2024 | Knowledge Distillation, Mamba | Unverified 0
Quantized Approximately Orthogonal Recurrent Neural Networks | Feb 5, 2024 | Quantization, Time Series | Unverified 0
Optimal and Near-Optimal Adaptive Vector Quantization | Feb 5, 2024 | Quantization | Unverified 0
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache | Feb 5, 2024 | Quantization | Code Available 3
FoldToken: Learning Protein Language via Vector Quantization and Beyond | Feb 4, 2024 | Quantization | Unverified 0
Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network | Feb 4, 2024 | Quantization | Unverified 0
LQER: Low-Rank Quantization Error Reconstruction for LLMs | Feb 4, 2024 | Knowledge Distillation, Quantization | Code Available 1
Leveraging Continuously Differentiable Activation Functions for Learning in Quantized Noisy Environments | Feb 4, 2024 | Quantization | Code Available 0
Locally-Adaptive Quantization for Streaming Vector Search | Feb 3, 2024 | Quantization, Retrieval | Unverified 0
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning | Feb 2, 2024 | Quantization | Unverified 0
Ultrafast jet classification on FPGAs for the HL-LHC | Feb 2, 2024 | Quantization | Code Available 0
Large Language Models for Time Series: A Survey | Feb 2, 2024 | Quantization, Survey | Code Available 4
Neural Language of Thought Models | Feb 2, 2024 | Image Generation, Object | Unverified 0
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding | Feb 2, 2024 | Adversarial Attack, Quantization | Code Available 0
Faster Inference of Integer SWIN Transformer by Removing the GELU Activation | Feb 2, 2024 | GPU, image-classification | Unverified 0
FedShift: Tackling Dual Heterogeneity Problem of Federated Learning via Weight Shift Aggregation | Feb 2, 2024 | Diversity, Federated Learning | Unverified 0
An Intra-BRNN and GB-RVQ Based END-TO-END Neural Audio Codec | Feb 2, 2024 | Quantization | Unverified 0
HW-SW Optimization of DNNs for Privacy-preserving People Counting on Low-resolution Infrared Arrays | Feb 2, 2024 | Neural Architecture Search, Privacy Preserving | Unverified 0
Truncated Non-Uniform Quantization for Distributed SGD | Feb 2, 2024 | Quantization | Unverified 0
Can Large Language Models Understand Context? | Feb 1, 2024 | In-Context Learning, Quantization | Unverified 0
Analog-digital Scheduling for Federated Learning: A Communication-Efficient Approach | Feb 1, 2024 | Federated Learning, Quantization | Unverified 0
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Jan 31, 2024 | GPU, Quantization | Code Available 3
Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs | Jan 31, 2024 | Deep Learning, Quantization | Unverified 0
Effect of Weight Quantization on Learning Models by Typical Case Analysis | Jan 30, 2024 | Quantization | Unverified 0
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training | Jan 30, 2024 | Quantization | Code Available 0
HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference | Jan 29, 2024 | Quantization | Unverified 0
Effective Communication with Dynamic Feature Compression | Jan 29, 2024 | Deep Reinforcement Learning, Feature Compression | Code Available 0
Scaling Sparse Fine-Tuning to Large Language Models | Jan 29, 2024 | parameter-efficient fine-tuning, Quantization | Code Available 1
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection, Autonomous Vehicles | Code Available 2
Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval | Jan 27, 2024 | Contrastive Learning, Image Retrieval | Unverified 0
A Comprehensive Survey of Compression Algorithms for Language Models | Jan 27, 2024 | Knowledge Distillation, Quantization | Unverified 0
Residual Quantization with Implicit Neural Codebooks | Jan 26, 2024 | Data Compression, Quantization | Code Available 2
MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer | Jan 26, 2024 | Quantization | Unverified 0
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization | Jan 26, 2024 | Quantization | Unverified 0
Within-basket Recommendation via Neural Pattern Associator | Jan 25, 2024 | Quantization | Unverified 0
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Jan 25, 2024 | GPU, Quantization | Code Available 3
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks | Jan 25, 2024 | Model Compression, Quantization | Unverified 0
Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators | Jan 25, 2024 | Quantization | Unverified 0
Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers | Jan 24, 2024 | Quantization | Unverified 0
Iterated Relevance Matrix Analysis (IRMA) for the identification of class-discriminative subspaces | Jan 23, 2024 | Dimensionality Reduction, Quantization | Unverified 0
Scaling Up Quantization-Aware Neural Architecture Search for Efficient Deep Learning on the Edge | Jan 22, 2024 | Neural Architecture Search, Quantization | Unverified 0
Robustness to distribution shifts of compressed networks for edge devices | Jan 22, 2024 | Knowledge Distillation, Quantization | Unverified 0
Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding | Jan 21, 2024 | Clustering, Image Compression | Unverified 0
Edge-Enabled Real-time Railway Track Segmentation | Jan 21, 2024 | GPU, Quantization | Unverified 0