SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors | Nov 26, 2024 | Quantization Sensitivity | Unverified | 0
MotionLLaMA: A Unified Framework for Motion Synthesis and Comprehension | Nov 26, 2024 | Language Modeling Language Modelling | Code Available | 2
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization | Nov 26, 2024 | Image Generation Quantization | Code Available | 0
Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape | Nov 25, 2024 | Quantization | Unverified | 0
Rethinking Diffusion for Text-Driven Human Motion Generation | Nov 25, 2024 | Motion Generation Quantization | Unverified | 0
Lion Cub: Minimizing Communication Overhead in Distributed Lion | Nov 25, 2024 | Quantization | Unverified | 0
Downlink MIMO Channel Estimation from Bits: Recoverability and Algorithm | Nov 25, 2024 | Quantization | Unverified | 0
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics | Nov 25, 2024 | Knowledge Distillation Multi-Task Learning | Unverified | 0
Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression | Nov 25, 2024 | Computational Efficiency Image Compression | Unverified | 0
Factorized Visual Tokenization and Generation | Nov 25, 2024 | Disentanglement Image Generation | Unverified | 0
MixPE: Quantization and Hardware Co-design for Efficient LLM Inference | Nov 25, 2024 | Quantization | Unverified | 0
SKQVC: One-Shot Voice Conversion by K-Means Quantization with Self-Supervised Speech Representations | Nov 25, 2024 | Quantization Self-Supervised Learning | Unverified | 0
Representation Collapsing Problems in Vector Quantization | Nov 25, 2024 | Quantization | Unverified | 0
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency | Nov 25, 2024 | Quantization Video Restoration | Code Available | 2
Efficient Online Inference of Vision Transformers by Training-Free Tokenization | Nov 23, 2024 | Quantization | Code Available | 0
freePruner: A Training-free Approach for Large Multimodal Model Acceleration | Nov 23, 2024 | Quantization Question Answering | Unverified | 0
FLARE: FP-Less PTQ and Low-ENOB ADC Based AMS-PiM for Error-Resilient, Fast, and Efficient Transformer Acceleration | Nov 22, 2024 | Quantization | Unverified | 0
TaQ-DiT: Time-aware Quantization for Diffusion Transformers | Nov 21, 2024 | Denoising Model Compression | Unverified | 0
Quantization without Tears | Nov 21, 2024 | GPU Quantization | Code Available | 1
AutoMixQ: Self-Adjusting Quantization for High Performance Memory-Efficient Fine-Tuning | Nov 21, 2024 | Quantization | Unverified | 0
Quantized symbolic time series approximation | Nov 20, 2024 | Anomaly Detection Astronomy | Code Available | 2
Disco Intelligent Omni-Surfaces: 360-degree Fully-Passive Jamming Attacks | Nov 20, 2024 | Quantization | Unverified | 0
RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content | Nov 20, 2024 | 4k Knowledge Distillation | Unverified | 0
High-Throughput Blind Co-Channel Interference Cancellation for Edge Devices Using Depthwise Separable Convolutions, Quantization, and Pruning | Nov 19, 2024 | blind source separation Computational Efficiency | Unverified | 0
SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search | Nov 19, 2024 | Quantization Re-Ranking | Code Available | 2
Diffusion Product Quantization | Nov 19, 2024 | Quantization | Unverified | 0
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration | Nov 18, 2024 | Quantization | Code Available | 0
Towards Accurate and Efficient Sub-8-Bit Integer Training | Nov 17, 2024 | CPU GPU | Unverified | 0
EfQAT: An Efficient Framework for Quantization-Aware Training | Nov 17, 2024 | Quantization | Unverified | 0
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Nov 17, 2024 | Image Generation Quantization | Code Available | 7
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices | Nov 16, 2024 | Quantization | Unverified | 0
An exploration of the effect of quantisation on energy consumption and inference time of StarCoder2 | Nov 15, 2024 | Model Compression Quantization | Code Available | 0
Systolic Arrays and Structured Pruning Co-design for Efficient Transformers in Edge Systems | Nov 15, 2024 | Machine Translation Quantization | Unverified | 0
AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference | Nov 15, 2024 | Quantization Question Answering | Unverified | 0
Communication Compression for Tensor Parallel LLM Inference | Nov 14, 2024 | Quantization | Unverified | 0
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization | Nov 12, 2024 | Language Modeling Language Modelling | Unverified | 0
Towards Low-bit Communication for Tensor Parallel LLM Inference | Nov 12, 2024 | Language Modeling Language Modelling | Unverified | 0
Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | Nov 12, 2024 | Imitation Learning Offline RL | Unverified | 0
HarmLevelBench: Evaluating Harm-Level Compliance and the Impact of Quantization on Model Alignment | Nov 11, 2024 | Quantization | Unverified | 0
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis | Nov 11, 2024 | Deep Learning Federated Learning | Unverified | 0
The Super Weight in Large Language Models | Nov 11, 2024 | Language Modeling Language Modelling | Code Available | 2
HAFLQ: Heterogeneous Adaptive Federated LoRA Fine-tuned LLM with Quantization | Nov 10, 2024 | Quantization text-classification | Unverified | 0
Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques | Nov 9, 2024 | Quantization | Unverified | 0
Intelligent Fault Diagnosis of Type and Severity in Low-Frequency, Low Bit-Depth Signals | Nov 9, 2024 | Fault Diagnosis Quantization | Unverified | 0
An asymmetric heuristic for trained ternary quantization based on the statistics of the weights: an application to medical signal classification | Nov 9, 2024 | Quantization | Code Available | 0
Expansion Quantization Network: An Efficient Micro-emotion Annotation and Detection Framework | Nov 9, 2024 | Emotion Detection and Classification Quantization | Code Available | 0
When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization | Nov 8, 2024 | Decoder Quantization | Unverified | 0
Rate-aware Compression for NeRF-based Volumetric Video | Nov 8, 2024 | NeRF Quantization | Unverified | 0
Qwen2.5-32B: Leveraging Self-Consistent Tool-Integrated Reasoning for Bengali Mathematical Olympiad Problem Solving | Nov 8, 2024 | Prompt Engineering Quantization | Unverified | 0
Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models | Nov 8, 2024 | Quantization Question Answering | Unverified | 0