AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers | Feb 7, 2025 | image-classification Image Classification | Unverified | 0
A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications | Feb 6, 2025 | NVIDIA Jetson Orin Nano object-detection | Unverified | 0
KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference | Feb 6, 2025 | Mathematical Reasoning Quantization | Code Available | 0
TQ-DiT: Efficient Time-Aware Quantization for Diffusion Transformers | Feb 6, 2025 | Computational Efficiency Quantization | Unverified | 0
Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization | Feb 6, 2025 | Quantization | Unverified | 0
Asymptotic Analysis of One-bit Quantized Box-Constrained Precoding in Large-Scale Multi-User Systems | Feb 5, 2025 | Quantization | Unverified | 0
SensorChat: Answering Qualitative and Quantitative Questions during Long-Term Multimodal Sensor Interactions | Feb 5, 2025 | Quantization Question Answering | Unverified | 0
HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference | Feb 5, 2025 | Language Modeling Language Modelling | Unverified | 0
BRIDLE: Generalized Self-supervised Learning with Quantization | Feb 4, 2025 | image-classification Image Classification | Code Available | 0
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales | Feb 4, 2025 | Language Modeling Language Modelling | Unverified | 0
Survey of Quantization Techniques for On-Device Vision-based Crack Detection | Feb 4, 2025 | Quantization Structural Health Monitoring | Unverified | 0
Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis | Feb 3, 2025 | Quantization Speech Synthesis | Unverified | 0
Nearly Lossless Adaptive Bit Switching | Feb 3, 2025 | Quantization | Code Available | 0
QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | Feb 3, 2025 | Data Valuation Language Modeling | Code Available | 0
Choose Your Model Size: Any Compression by a Single Gradient Descent | Feb 3, 2025 | Quantization | Unverified | 0
An Inquiry into Datacenter TCO for LLM Inference with FP8 | Feb 3, 2025 | Language Modeling Language Modelling | Unverified | 0
On Noncommutative Quantum Mechanics and the Black-Scholes Model | Feb 2, 2025 | Quantization | Unverified | 0
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference | Feb 2, 2025 | Model Compression Quantization | Unverified | 0
Structural Latency Perturbation in Large Language Models Through Recursive State Induction | Feb 2, 2025 | Computational Efficiency Quantization | Unverified | 0
Enhancing Field-Oriented Control of Electric Drives with Tiny Neural Network Optimized for Micro-controllers | Feb 1, 2025 | Quantization | Unverified | 0
MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization | Feb 1, 2025 | Quantization | Unverified | 0
Fully Distributed and Quantized Algorithm for MPC-based Autonomous Vehicle Platooning Optimization | Jan 31, 2025 | Model Predictive Control Quantization | Unverified | 0
LLM-based Affective Text Generation Quality Based on Different Quantization Values | Jan 31, 2025 | GPU Quantization | Unverified | 0
CodeBrain: Impute Any Brain MRI via Instance-specific Scalar-quantized Codes | Jan 30, 2025 | Imputation Quantization | Unverified | 0
Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models | Jan 30, 2025 | Graph Neural Network Quantization | Unverified | 0
Distinguished Quantized Guidance for Diffusion-based Sequence Recommendation | Jan 29, 2025 | Denoising Quantization | Unverified | 0
Optimizing Large Language Model Training Using FP4 Quantization | Jan 28, 2025 | Language Modeling Language Modelling | Unverified | 0
Post-Training Quantization for Vision Mamba with k-Scaled Quantization and Reparameterization | Jan 28, 2025 | Mamba Quantization | Unverified | 0
EdgeMLOps: Operationalizing ML models with Cumulocity IoT and thin-edge.io for Visual quality Inspection | Jan 28, 2025 | Asset Management Management | Unverified | 0
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines | Jan 28, 2025 | Image Segmentation Medical Image Segmentation | Code Available | 0
Stabilization of an unstable reaction-diffusion PDE with input delay despite state and input quantization | Jan 27, 2025 | Quantization | Unverified | 0
One-Bit Sigma-Delta DFRC Waveform Design: Using Quantization Noise for Radar Probing | Jan 27, 2025 | Quantization | Unverified | 0
SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity | Jan 26, 2025 | Image Generation Quantization | Unverified | 0
Decentralized Low-Rank Fine-Tuning of Large Language Models | Jan 26, 2025 | Federated Learning parameter-efficient fine-tuning | Unverified | 0
RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations | Jan 25, 2025 | Computational Efficiency GSM8K | Unverified | 0
FBQuant: FeedBack Quantization for Large Language Models | Jan 25, 2025 | Quantization | Unverified | 0
AKVQ-VL: Attention-Aware KV Cache Adaptive 2-Bit Quantization for Vision-Language Models | Jan 25, 2025 | Quantization | Unverified | 0
On Accelerating Edge AI: Optimizing Resource-Constrained Environments | Jan 25, 2025 | Knowledge Distillation Model Compression | Unverified | 0
On Hardening DNNs against Noisy Computations | Jan 24, 2025 | Quantization | Unverified | 0
Channel-Aware Constellation Design for Digital OTA Computation | Jan 24, 2025 | Quantization | Unverified | 0
End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml | Jan 24, 2025 | Quantization | Unverified | 0
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models | Jan 24, 2025 | Model Compression Quantization | Unverified | 0
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods | Jan 23, 2025 | Mamba Quantization | Unverified | 0
Qrazor: Reliable and effortless 4-bit llm quantization by significant data razoring | Jan 23, 2025 | Quantization | Unverified | 0
DQ-Data2vec: Decoupling Quantization for Multilingual Speech Recognition | Jan 23, 2025 | Quantization Representation Learning | Unverified | 0
QMamba: Post-Training Quantization for Vision State Space Models | Jan 23, 2025 | Quantization State Space Models | Unverified | 0
Diffusion-based Perceptual Neural Video Compression with Temporal Diffusion Information Reuse | Jan 23, 2025 | Image Compression Quantization | Unverified | 0
Irrational Complex Rotations Empower Low-bit Optimizers | Jan 22, 2025 | GPU Quantization | Unverified | 0
GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Jan 22, 2025 | GPU Quantization | Code Available | 0
HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation | Jan 22, 2025 | CPU GPU | Unverified | 0