Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation | Jun 7, 2025 | MedQA, Quantization | Unverified
Towards AI-Native Fronthaul: Neural Compression for NextG Cloud RAN | Jun 7, 2025 | Quantization | Unverified
Bridging the Modality Gap: Softly Discretizing Audio Representation for LLM-based Automatic Speech Recognition | Jun 6, 2025 | Automatic Speech Recognition (ASR) | Unverified
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model | Jun 6, 2025 | Natural Language Understanding, Quantization | Code Available
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning | Jun 6, 2025 | Continuous Control | Unverified
TaDA: Training-free recipe for Decoding with Adaptive KV Cache Compression and Mean-centering | Jun 5, 2025 | Quantization | Unverified
FPTQuant: Function-Preserving Transforms for LLM Quantization | Jun 5, 2025 | Quantization | Unverified
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion | Jun 5, 2025 | Denoising, Quantization | Unverified
Massive MIMO with 1-Bit DACs: Data Detection for Quantized Linear Precoding with Dithering | Jun 5, 2025 | Quantization | Unverified
PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling | Jun 5, 2025 | Clustering, Quantization | Unverified
Kernel k-Medoids as General Vector Quantization | Jun 5, 2025 | Data Compression, Density Estimation | Unverified
Nonlinear Sparse Bayesian Learning Methods with Application to Massive MIMO Channel Estimation with Hardware Impairments | Jun 4, 2025 | Quantization | Unverified
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing | Jun 4, 2025 | Quantization, text-to-speech | Unverified
STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Jun 4, 2025 | Action Generation, Quantization | Code Available
Quantized Dissipative Uncertain Model for Fractional T_S Fuzzy systems with Time_Varying Delays Under Networked Control System | Jun 3, 2025 | Quantization | Unverified
Enhancing Convergence, Privacy and Fairness for Wireless Personalized Federated Learning: Quantization-Assisted Min-Max Fair Scheduling | Jun 3, 2025 | Fairness, Federated Learning | Unverified
MUC-G4: Minimal Unsat Core-Guided Incremental Verification for Deep Neural Network Compression | Jun 3, 2025 | Neural Network Compression, Quantization | Unverified
Parameter Efficient Fine Tuning Llama 3.1 for Answering Arabic Legal Questions: A Case Study on Jordanian Laws | Jun 2, 2025 | Language Modeling | Code Available
Flexible Mixed Precision Quantization for Learned Image Compression | Jun 2, 2025 | Image Compression, Quantization | Code Available
Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs | Jun 2, 2025 | Quantization | Unverified
Structured Pruning and Quantization for Learned Image Compression | Jun 2, 2025 | Image Classification | Code Available
Enhancing Speech Emotion Recognition with Graph-Based Multimodal Fusion and Prosodic Features for the Speech Emotion Recognition in Naturalistic Conditions Challenge at Interspeech 2025 | Jun 2, 2025 | Audio Tagging, Emotion Recognition | Unverified
CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer | Jun 1, 2025 | Audio captioning, Language Modeling | Unverified
Quantization-based Bounds on the Wasserstein Metric | Jun 1, 2025 | Computational Efficiency, Domain Adaptation | Unverified
Power-of-Two (PoT) Weights in Large Language Models (LLMs) | May 31, 2025 | Quantization | Unverified
LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | May 30, 2025 | Quantization | Code Available
Edge Computing for Physics-Driven AI in Computational MRI: A Feasibility Study | May 30, 2025 | Computational Efficiency, Edge-computing | Unverified
Running Conventional Automatic Speech Recognition on Memristor Hardware: A Simulated Approach | May 30, 2025 | Automatic Speech Recognition, Quantization | Unverified
LittleBit: Ultra Low-Bit Quantization via Latent Factorization | May 30, 2025 | Quantization | Unverified
MuLoCo: Muon is a practical inner optimizer for DiLoCo | May 29, 2025 | Decoder, Quantization | Unverified
Efficient Quantum Approximate kNN Algorithm via Granular-Ball Computing | May 29, 2025 | Quantization | Unverified
Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation | May 29, 2025 | Domain Adaptation, Multi-target Domain Adaptation | Code Available
Revisiting Uncertainty Estimation and Calibration of Large Language Models | May 29, 2025 | Mixture-of-Experts, MMLU | Unverified
Highly Efficient and Effective LLMs with Multi-Boolean Architectures | May 28, 2025 | Binarization, Quantization | Unverified
Climate Finance Bench | May 28, 2025 | Logical Reasoning, Quantization | Code Available
On the Interplay of Privacy, Persuasion and Quantization | May 28, 2025 | Decision Making, Decoder | Unverified
Does quantization affect models' performance on long-context tasks? | May 26, 2025 | Quantization | Code Available
Small Language Models: Architectures, Techniques, Evaluation, Problems and Future Adaptation | May 26, 2025 | Model Compression, Quantization | Unverified
LPCM: Learning-based Predictive Coding for LiDAR Point Cloud Compression | May 26, 2025 | Quantization | Unverified
CA3D: Convolutional-Attentional 3D Nets for Efficient Video Activity Recognition on the Edge | May 26, 2025 | Activity Recognition, Quantization | Unverified
BrainStratify: Coarse-to-Fine Disentanglement of Intracranial Neural Dynamics | May 26, 2025 | Brain Computer Interface, Disentanglement | Unverified
Optimizing edge AI models on HPC systems with the edge in the loop | May 26, 2025 | Hardware Aware Neural Architecture Search, Knowledge Distillation | Code Available
Efficient Speech Translation through Model Compression and Knowledge Distillation | May 26, 2025 | Knowledge Distillation, Model Compression | Code Available
Communication-Efficient Multi-Device Inference Acceleration for Transformer Models | May 25, 2025 | Quantization | Code Available
FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization | May 25, 2025 | Computational Efficiency, CPU | Unverified
LoTA-QAF: Lossless Ternary Adaptation for Quantization-Aware Fine-Tuning | May 24, 2025 | Computational Efficiency, MMLU | Code Available
Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees | May 24, 2025 | Quantization | Code Available
Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation | May 24, 2025 | Quantization, Representation Learning | Unverified
Efficient and Workload-Aware LLM Serving via Runtime Layer Swapping and KV Cache Resizing | May 24, 2025 | Model Compression, Quantization | Unverified
Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding | May 24, 2025 | Quantization | Code Available