Survey of Quantization Techniques for On-Device Vision-based Crack Detection | Feb 4, 2025 | Quantization, Structural Health Monitoring | Unverified 0
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales | Feb 4, 2025 | Language Modeling, Language Modelling | Unverified 0
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding | Feb 3, 2025 | Quantization | Code Available 2
Choose Your Model Size: Any Compression by a Single Gradient Descent | Feb 3, 2025 | Quantization | Unverified 0
QLESS: A Quantized Approach for Data Valuation and Selection in Large Language Model Fine-Tuning | Feb 3, 2025 | Data Valuation, Language Modeling | Code Available 0
Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis | Feb 3, 2025 | Quantization, Speech Synthesis | Unverified 0
An Inquiry into Datacenter TCO for LLM Inference with FP8 | Feb 3, 2025 | Language Modeling, Language Modelling | Unverified 0
Nearly Lossless Adaptive Bit Switching | Feb 3, 2025 | Quantization | Code Available 0
Structural Latency Perturbation in Large Language Models Through Recursive State Induction | Feb 2, 2025 | Computational Efficiency, Quantization | Unverified 0
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference | Feb 2, 2025 | Model Compression, Quantization | Unverified 0
On Noncommutative Quantum Mechanics and the Black-Scholes Model | Feb 2, 2025 | Quantization | Unverified 0
MQuant: Unleashing the Inference Potential of Multimodal Large Language Models via Full Static Quantization | Feb 1, 2025 | Quantization | Unverified 0
Enhancing Field-Oriented Control of Electric Drives with Tiny Neural Network Optimized for Micro-controllers | Feb 1, 2025 | Quantization | Unverified 0
LLM-based Affective Text Generation Quality Based on Different Quantization Values | Jan 31, 2025 | GPU, Quantization | Unverified 0
Visual Autoregressive Modeling for Image Super-Resolution | Jan 31, 2025 | Image Super-Resolution, Quantization | Code Available 2
Fully Distributed and Quantized Algorithm for MPC-based Autonomous Vehicle Platooning Optimization | Jan 31, 2025 | Model Predictive Control, Quantization | Unverified 0
Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models | Jan 31, 2025 | GPU, Quantization | Code Available 1
Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models | Jan 30, 2025 | Graph Neural Network, Quantization | Unverified 0
CodeBrain: Impute Any Brain MRI via Instance-specific Scalar-quantized Codes | Jan 30, 2025 | Imputation, Quantization | Unverified 0
Distinguished Quantized Guidance for Diffusion-based Sequence Recommendation | Jan 29, 2025 | Denoising, Quantization | Unverified 0
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines | Jan 28, 2025 | Image Segmentation, Medical Image Segmentation | Code Available 0
Post-Training Quantization for Vision Mamba with k-Scaled Quantization and Reparameterization | Jan 28, 2025 | Mamba, Quantization | Unverified 0
EdgeMLOps: Operationalizing ML models with Cumulocity IoT and thin-edge.io for Visual quality Inspection | Jan 28, 2025 | Asset Management, Management | Unverified 0
Optimizing Large Language Model Training Using FP4 Quantization | Jan 28, 2025 | Language Modeling, Language Modelling | Unverified 0
Stabilization of an unstable reaction-diffusion PDE with input delay despite state and input quantization | Jan 27, 2025 | Quantization | Unverified 0
One-Bit Sigma-Delta DFRC Waveform Design: Using Quantization Noise for Radar Probing | Jan 27, 2025 | Quantization | Unverified 0
SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity | Jan 26, 2025 | Image Generation, Quantization | Unverified 0
Decentralized Low-Rank Fine-Tuning of Large Language Models | Jan 26, 2025 | Federated Learning, parameter-efficient fine-tuning | Unverified 0
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting | Jan 26, 2025 | Quantization | Code Available 2
FBQuant: FeedBack Quantization for Large Language Models | Jan 25, 2025 | Quantization | Unverified 0
RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations | Jan 25, 2025 | Computational Efficiency, GSM8K | Unverified 0
AKVQ-VL: Attention-Aware KV Cache Adaptive 2-Bit Quantization for Vision-Language Models | Jan 25, 2025 | Quantization | Unverified 0
On Accelerating Edge AI: Optimizing Resource-Constrained Environments | Jan 25, 2025 | Knowledge Distillation, Model Compression | Unverified 0
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models | Jan 24, 2025 | Model Compression, Quantization | Unverified 0
Channel-Aware Constellation Design for Digital OTA Computation | Jan 24, 2025 | Quantization | Unverified 0
End-to-end workflow for machine learning-based qubit readout with QICK and hls4ml | Jan 24, 2025 | Quantization | Unverified 0
On Hardening DNNs against Noisy Computations | Jan 24, 2025 | Quantization | Unverified 0
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting | Jan 23, 2025 | Language Modeling, Language Modelling | Code Available 2
Qrazor: Reliable and effortless 4-bit llm quantization by significant data razoring | Jan 23, 2025 | Quantization | Unverified 0
QMamba: Post-Training Quantization for Vision State Space Models | Jan 23, 2025 | Quantization, State Space Models | Unverified 0
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods | Jan 23, 2025 | Mamba, Quantization | Unverified 0
Diffusion-based Perceptual Neural Video Compression with Temporal Diffusion Information Reuse | Jan 23, 2025 | Image Compression, Quantization | Unverified 0
DQ-Data2vec: Decoupling Quantization for Multilingual Speech Recognition | Jan 23, 2025 | Quantization, Representation Learning | Unverified 0
Quantized Spike-driven Transformer | Jan 23, 2025 | Quantization | Code Available 1
HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation | Jan 22, 2025 | CPU, GPU | Unverified 0
Irrational Complex Rotations Empower Low-bit Optimizers | Jan 22, 2025 | GPU, Quantization | Unverified 0
Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made Scenes | Jan 22, 2025 | 3DGS, Quantization | Unverified 0
GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Jan 22, 2025 | GPU, Quantization | Code Available 0
SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization | Jan 21, 2025 | Quantization | Unverified 0
HAC++: Towards 100X Compression of 3D Gaussian Splatting | Jan 21, 2025 | 3DGS, Attribute | Code Available 3