BitNet b1.58 Reloaded: State-of-the-art Performance Also on Smaller Networks; Jun 24, 2024; Quantization
[Unverified, 0] Received Power Maximization Using Nonuniform Discrete Phase Shifts for RISs With a Limited Phase Range; Jun 23, 2024; 2k Quantization
[Unverified, 0] Towards Real-Time Neural Volumetric Rendering on Mobile Devices: A Measurement Study; Jun 23, 2024; NeRF Quantization
[Unverified, 0] HLQ: Fast and Efficient Backpropagation via Hadamard Low-rank Quantization; Jun 21, 2024; Quantization
[Unverified, 0] Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE; Jun 20, 2024; Quantization
[Unverified, 0] FLoCoRA: Federated learning compression with low-rank adaptation; Jun 20, 2024; Federated Learning Model Compression
[Code Available, 0] xCOMET-lite: Bridging the Gap Between Efficiency and Quality in Learned MT Evaluation Metrics; Jun 20, 2024; Machine Translation Quantization
[Code Available, 0] SDQ: Sparse Decomposed Quantization for LLM Inference; Jun 19, 2024; Model Compression Quantization
[Unverified, 0] Q-SNNs: Quantized Spiking Neural Networks; Jun 19, 2024; Quantization
[Unverified, 0] High-Fidelity Facial Albedo Estimation via Texture Quantization; Jun 19, 2024; 3D Face Reconstruction Face Reconstruction
[Unverified, 0] Attention-aware Post-training Quantization without Backpropagation; Jun 19, 2024; Quantization
[Unverified, 0] MSE Minimization in RIS-Aided MU-MIMO with Discrete Phase Shifts and Fronthaul Quantization; Jun 18, 2024; Quantization
[Unverified, 0] Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates; Jun 18, 2024; parameter-efficient fine-tuning Quantization
[Unverified, 0] Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization; Jun 17, 2024; Language Modeling Language Modelling
[Unverified, 0] Deep-Learning-Based Channel Estimation for Distributed MIMO with 1-bit Radio-Over-Fiber Fronthaul; Jun 17, 2024; Quantization
[Unverified, 0] Promoting Data and Model Privacy in Federated Learning through Quantized LoRA; Jun 16, 2024; Federated Learning parameter-efficient fine-tuning
[Unverified, 0] Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization; Jun 16, 2024; Quantization Tensor Decomposition
[Unverified, 0] An Analysis on Quantizing Diffusion Transformers; Jun 16, 2024; Conditional Image Generation Denoising
[Unverified, 0] Optimization of Armv9 architecture general large language model inference performance based on Llama.cpp; Jun 16, 2024; Compiler Optimization Language Modeling
[Code Available, 0] How Should We Extract Discrete Audio Tokens from Self-Supervised Models?; Jun 15, 2024; Quantization Self-Supervised Learning
[Unverified, 0] Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training; Jun 15, 2024; Quantization
[Unverified, 0] GEB-1.3B: Open Lightweight Large Language Model; Jun 14, 2024; CPU Language Modeling
[Unverified, 0] Optimizing Byte-level Representation for End-to-end ASR; Jun 14, 2024; Automatic Speech Recognition Automatic Speech Recognition (ASR)
[Unverified, 0] One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model; Jun 14, 2024; All Quantization
[Unverified, 0] Precipitation Nowcasting Using Physics Informed Discriminator Generative Models; Jun 14, 2024; Generative Adversarial Network Quantization
[Unverified, 0] Human-level molecular optimization driven by mol-gene evolution; Jun 13, 2024; Drug Discovery Quantization
[Unverified, 0] MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction; Jun 13, 2024; Quantization
[Unverified, 0] ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis; Jun 13, 2024; Quantization Speech Synthesis
[Unverified, 0] ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models; Jun 13, 2024; Code Generation domain classification
[Unverified, 0] Q-S5: Towards Quantized State Space Models; Jun 13, 2024; Computational Efficiency Quantization
[Code Available, 0] Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization; Jun 12, 2024; Computational Efficiency Pose Estimation
[Unverified, 0] Compressive Beam Alignment for Indoor Millimeter-Wave Systems; Jun 12, 2024; compressed sensing Quantization
[Unverified, 0] MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases; Jun 12, 2024; Benchmarking Model Compression
[Unverified, 0] VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment; Jun 12, 2024; Quantization Speech Synthesis
[Unverified, 0] FoldToken2: Learning compact, invariant and generative protein structure language; Jun 11, 2024; Decoder Quantization
[Unverified, 0] T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text; Jun 11, 2024; Quantization Sign Language Production
[Unverified, 0] TernaryLLM: Ternarized Large Language Model; Jun 11, 2024; Knowledge Distillation Language Modeling
[Unverified, 0] Topological Analysis for Detecting Anomalies (TADA) in Time Series; Jun 10, 2024; Quantization Time Series
[Unverified, 0] Efficient Neural Compression with Inference-time Decoding; Jun 10, 2024; Decoder Quantization
[Unverified, 0] Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks; Jun 10, 2024; Quantization
[Unverified, 0] The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs; Jun 10, 2024; Quantization RAG
[Unverified, 0] Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization; Jun 8, 2024; Quantization Speaker Verification
[Unverified, 0] Spectral Codecs: Improving Non-Autoregressive Speech Synthesis with Spectrogram-Based Audio Codecs; Jun 7, 2024; Quantization Speech Synthesis
[Unverified, 0] Activation Map-based Vector Quantization for 360-degree Image Semantic Communication; Jun 7, 2024; Quantization Semantic Communication
[Unverified, 0] Winner-takes-all learners are geometry-aware conditional density estimators; Jun 7, 2024; All Density Estimation
[Code Available, 0] Proofread: Fixes All Errors with One Tap; Jun 6, 2024; All Quantization
[Unverified, 0] BitsFusion: 1.99 bits Weight Quantization of Diffusion Model; Jun 6, 2024; Image Generation model
[Unverified, 0] Real-Time Spacecraft Pose Estimation Using Mixed-Precision Quantized Neural Network on COTS Reconfigurable MPSoC; Jun 6, 2024; Pose Estimation Quantization
[Code Available, 0] VQUNet: Vector Quantization U-Net for Defending Adversarial Atacks by Regularizing Unwanted Noise; Jun 5, 2024; Adversarial Attack Quantization
[Unverified, 0] Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity; Jun 5, 2024; GPU Quantization
[Unverified, 0]