RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy | Dec 2, 2024 | Computational Efficiency Language Modeling | Code available
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | Dec 2, 2024 | Autonomous Driving Decision Making | No code listed
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification | Dec 2, 2024 | GPU Quantization | No code listed
Reducing Inference Energy Consumption Using Dual Complementary CNNs | Dec 2, 2024 | Quantization | Code available
Optimizing Domain-Specific Image Retrieval: A Benchmark of FAISS and Annoy with Fine-Tuned Features | Dec 2, 2024 | Image Retrieval Quantization | No code listed
A Wave is Worth 100 Words: Investigating Cross-Domain Transferability in Time Series | Dec 1, 2024 | Imputation Quantization | No code listed
LAMBDA: Covering the Multimodal Critical Scenarios for Automated Driving Systems by Search Space Quantization | Nov 30, 2024 | Quantization | No code listed
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation | Nov 29, 2024 | Quantization Vision-Language-Action | No code listed
Quantized Delta Weight Is Safety Keeper | Nov 29, 2024 | Quantization | No code listed
Privacy-Preserving Orthogonal Aggregation for Guaranteeing Gender Fairness in Federated Recommendation | Nov 29, 2024 | Attribute Fairness | No code listed
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding | Nov 29, 2024 | Motion Synthesis Quantization | No code listed
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads | Nov 28, 2024 | GPU Language Modeling | No code listed
On the effectiveness of discrete representations in sparse mixture of experts | Nov 28, 2024 | Mixture-of-Experts Quantization | No code listed
FAMES: Fast Approximate Multiplier Substitution for Mixed-Precision Quantized DNNs--Down to 2 Bits! | Nov 27, 2024 | Quantization | No code listed
SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors | Nov 26, 2024 | Quantization Sensitivity | No code listed
COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection | Nov 26, 2024 | Quantization | No code listed
Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving | Nov 26, 2024 | Autonomous Driving Quantization | No code listed
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization | Nov 26, 2024 | Image Generation Quantization | Code available
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens | Nov 26, 2024 | Quantization | No code listed
Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression | Nov 25, 2024 | Computational Efficiency Image Compression | No code listed
Factorized Visual Tokenization and Generation | Nov 25, 2024 | Disentanglement Image Generation | No code listed
Representation Collapsing Problems in Vector Quantization | Nov 25, 2024 | Quantization | No code listed
MixPE: Quantization and Hardware Co-design for Efficient LLM Inference | Nov 25, 2024 | Quantization | No code listed
SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations | Nov 25, 2024 | Quantization Self-Supervised Learning | No code listed
Lion Cub: Minimizing Communication Overhead in Distributed Lion | Nov 25, 2024 | Quantization | No code listed
Downlink MIMO Channel Estimation from Bits: Recoverability and Algorithm | Nov 25, 2024 | Quantization | No code listed
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics | Nov 25, 2024 | Knowledge Distillation Multi-Task Learning | No code listed
Rethinking Diffusion for Text-Driven Human Motion Generation | Nov 25, 2024 | Motion Generation Quantization | No code listed
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape | Nov 25, 2024 | Quantization | No code listed
freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization Question Answering | No code listed
Efficient Online Inference of Vision Transformers by Training-Free Tokenization | Nov 23, 2024 | Quantization | Code available
FLARE: FP-Less PTQ and Low-ENOB ADC Based AMS-PiM for Error-Resilient, Fast, and Efficient Transformer Acceleration | Nov 22, 2024 | Quantization | No code listed
TaQ-DiT: Time-aware Quantization for Diffusion Transformers | Nov 21, 2024 | Denoising Model Compression | No code listed
AutoMixQ: Self-Adjusting Quantization for High Performance Memory-Efficient Fine-Tuning | Nov 21, 2024 | Quantization | No code listed
RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content | Nov 20, 2024 | 4k Knowledge Distillation | No code listed
Disco Intelligent Omni-Surfaces: 360-degree Fully-Passive Jamming Attacks | Nov 20, 2024 | Quantization | No code listed
High-Throughput Blind Co-Channel Interference Cancellation for Edge Devices Using Depthwise Separable Convolutions, Quantization, and Pruning | Nov 19, 2024 | blind source separation Computational Efficiency | No code listed
Diffusion Product Quantization | Nov 19, 2024 | Quantization | No code listed
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration | Nov 18, 2024 | Quantization | Code available
EfQAT: An Efficient Framework for Quantization-Aware Training | Nov 17, 2024 | Quantization | No code listed
Towards Accurate and Efficient Sub-8-Bit Integer Training | Nov 17, 2024 | CPU GPU | No code listed
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices | Nov 16, 2024 | Quantization | No code listed
An exploration of the effect of quantisation on energy consumption and inference time of StarCoder2 | Nov 15, 2024 | Model Compression Quantization | Code available
Systolic Arrays and Structured Pruning Co-design for Efficient Transformers in Edge Systems | Nov 15, 2024 | Machine Translation Quantization | No code listed
AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference | Nov 15, 2024 | Quantization Question Answering | No code listed
Communication Compression for Tensor Parallel LLM Inference | Nov 14, 2024 | Quantization | No code listed
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization | Nov 12, 2024 | Language Modeling Language Modelling | No code listed
Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | Nov 12, 2024 | Imitation Learning Offline RL | No code listed
Towards Low-bit Communication for Tensor Parallel LLM Inference | Nov 12, 2024 | Language Modeling Language Modelling | No code listed
HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment | Nov 11, 2024 | Quantization | No code listed