EfficientLLM: Efficiency in Large Language Models | May 20, 2025 | Mixture-of-Experts, Quantization | Unverified | 0
Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability | May 20, 2025 | counterfactual, Memorization | Unverified | 0
Scaling Law for Quantization-Aware Training | May 20, 2025 | Quantization | Code Available | 4
Dual Precision Quantization for Efficient and Accurate Deep Neural Networks Inference | May 20, 2025 | Quantization, speech-recognition | Code Available | 0
Layer-wise Quantization for Quantized Optimistic Dual Averaging | May 20, 2025 | Quantization | Unverified | 0
Optimizing Binary and Ternary Neural Network Inference on RRAM Crossbars using CIM-Explorer | May 20, 2025 | Quantization | Code Available | 1
QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding | May 19, 2025 | Quantization, Spoken Language Understanding | Code Available | 0
An Overview of Arithmetic Adaptations for Inference of Convolutional Neural Networks on Re-configurable Hardware | May 19, 2025 | Quantization | Code Available | 0
GANCompress: GAN-Enhanced Neural Image Compression with Binary Spherical Quantization | May 19, 2025 | Computational Efficiency, Image Compression | Unverified | 0
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space | May 19, 2025 | Language Modeling, Language Modelling | Code Available | 2
Deep Unfolding with Kernel-based Quantization in MIMO Detection | May 19, 2025 | Density Estimation, Edge-computing | Unverified | 0
UniHM: Universal Human Motion Generation with Object Interactions in Indoor Scenes | May 19, 2025 | Human-Object Interaction Detection, Motion Generation | Unverified | 0
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | May 19, 2025 | GPU, Quantization | Code Available | 1
Automatic mixed precision for optimizing gained time with constrained loss mean-squared-error based on model partition to sequential sub-graphs | May 19, 2025 | Quantization, Sensitivity | Unverified | 0
A3: an Analytical Low-Rank Approximation Framework for Attention | May 19, 2025 | Quantization | Unverified | 0
KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache | May 18, 2025 | Quantization | Unverified | 0
CALM: Co-evolution of Algorithms and Language Model for Automatic Heuristic Design | May 18, 2025 | GPU, Language Modeling | Unverified | 0
PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement | May 18, 2025 | Quantization, Video Enhancement | Code Available | 0
Hyperbolic Residual Quantization: Discrete Representations for Data with Latent Hierarchies | May 18, 2025 | Inductive Bias, Knowledge Graphs | Unverified | 0
FedHQ: Hybrid Runtime Quantization for Federated Learning | May 17, 2025 | Federated Learning, Quantization | Unverified | 0
Qronos: Correcting the Past by Shaping the Future... in Post-Training Quantization | May 16, 2025 | Quantization, Text Generation | Unverified | 0
Benchmarking CFAR and CNN-based Peak Detection Algorithms in ISAC under Hardware Impairments | May 16, 2025 | Benchmarking, Integrated sensing and communication | Unverified | 0
Formal Uncertainty Propagation for Stochastic Dynamical Systems with Additive Noise | May 16, 2025 | Quantization, Stochastic Optimization | Unverified | 0
QVGen: Pushing the Limit of Quantized Video Generative Models | May 16, 2025 | Quantization | Unverified | 0
MARRS: Masked Autoregressive Unit-based Reaction Synthesis | May 16, 2025 | Motion Generation, Quantization | Unverified | 0
Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training | May 16, 2025 | GPU, Quantization | Unverified | 0
Addition is almost all you need: Compressing neural networks with double binary factorization | May 16, 2025 | All, Binarization | Code Available | 0
GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models | May 16, 2025 | Adversarial Attack, Adversarial Defense | Code Available | 1
Accurate KV Cache Quantization with Outlier Tokens Tracing | May 16, 2025 | Quantization | Code Available | 1
EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes | May 16, 2025 | 3DGS, NeRF | Code Available | 1
A probabilistic framework for dynamic quantization | May 15, 2025 | Quantization | Unverified | 0
VQ-Logits: Compressing the Output Bottleneck of Large Language Models via Vector Quantized Logits | May 15, 2025 | Language Modeling, Language Modelling | Unverified | 0
TransPL: VQ-Code Transition Matrices for Pseudo-Labeling of Time Series Unsupervised Domain Adaptation | May 15, 2025 | Domain Adaptation, Pseudo Label | Code Available | 0
Analog Foundation Models | May 14, 2025 | 4k, Quantization | Code Available | 1
Zero-shot Quantization: A Comprehensive Survey | May 14, 2025 | Quantization, Survey | Unverified | 0
Efficient Mixed Precision Quantization in Graph Neural Networks | May 14, 2025 | Graph Classification, Node Classification | Code Available | 0
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference | May 13, 2025 | Quantization | Unverified | 0
Multi-Layer Hierarchical Federated Learning with Quantization | May 13, 2025 | Federated Learning, Quantization | Unverified | 0
Efficient ANN-SNN Conversion with Error Compensation Learning | May 12, 2025 | Quantization | Unverified | 0
Cognitive Non-Coherent Jamming Techniques for Frequency Selective Attacks | May 12, 2025 | Quantization | Unverified | 0
An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits | May 12, 2025 | All, Knowledge Distillation | Unverified | 0
QuantX: A Framework for Hardware-Aware Quantization of Generative AI Workloads | May 12, 2025 | Quantization | Unverified | 0
Continuous Visual Autoregressive Generation via Score Maximization | May 12, 2025 | Quantization | Code Available | 1
Bang for the Buck: Vector Search on Cloud CPUs | May 12, 2025 | CPU, Quantization | Unverified | 0
Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption | May 12, 2025 | GPU, Knowledge Base Question Answering | Unverified | 0
Semantic Retention and Extreme Compression in LLMs: Can We Have Both? | May 12, 2025 | Language Modeling, Language Modelling | Unverified | 0
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance | May 11, 2025 | Language Modeling, Language Modelling | Code Available | 2
Improving Block-Wise LLM Quantization by 4-bit Block-Wise Optimal Float (BOF4): Analysis and Variations | May 10, 2025 | Language Modeling, Language Modelling | Unverified | 0
Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | May 9, 2025 | CPU, GPU | Unverified | 0
LightNobel: Improving Sequence Length Limitation in Protein Structure Prediction Model via Adaptive Activation Quantization | May 9, 2025 | Protein Folding, Protein Structure Prediction | Unverified | 0