Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection | May 10, 2024 | Autonomous Driving GPU | Unverified 0
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks | May 9, 2024 | Knowledge Distillation Model Compression | Unverified 0
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit | May 9, 2024 | Benchmarking Computational Efficiency | Code Available 4
Ditto: Quantization-aware Secure Inference of Transformers upon MPC | May 9, 2024 | Quantization | Code Available 3
Custom Gradient Estimators are Straight-Through Estimators in Disguise | May 8, 2024 | Quantization | Unverified 0
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | May 7, 2024 | GPU Language Modelling | Code Available 4
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization | May 7, 2024 | GPU Language Modeling | Unverified 0
Compression-based Privacy Preservation for Distributed Nash Equilibrium Seeking in Aggregative Games | May 6, 2024 | Quantization | Unverified 0
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer | May 6, 2024 | Efficient ViTs Model Compression | Code Available 0
DeltaKWS: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM | May 6, 2024 | channel selection Keyword Spotting | Unverified 0
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs | May 6, 2024 | Quantization | Code Available 1
Vector Quantization for Recommender Systems: A Review and Outlook | May 6, 2024 | Feature Compression Quantization | Code Available 1
PTQ4SAM: Post-Training Quantization for Segment Anything | May 6, 2024 | Instance Segmentation object-detection | Code Available 2
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | May 6, 2024 | Arithmetic Reasoning Code Generation | Unverified 0
Quantifying the Capabilities of LLMs across Scale and Precision | May 6, 2024 | Hallucination Misinformation | Unverified 0
Joint Discrete Precoding and RIS Optimization for RIS-Assisted MU-MIMO Communication Systems | May 5, 2024 | Quantization | Unverified 0
Efficient Text-driven Motion Generation via Latent Consistency Training | May 5, 2024 | Motion Generation Quantization | Code Available 0
Exploring Extreme Quantization in Spiking Language Models | May 4, 2024 | Knowledge Distillation Language Modeling | Unverified 0
Lightweight Change Detection in Heterogeneous Remote Sensing Images with Online All-Integer Pruning Training | May 3, 2024 | All Change Detection | Unverified 0
Three Quantization Regimes for ReLU Networks | May 3, 2024 | Quantization | Unverified 0
Network reconstruction via the minimum description length principle | May 2, 2024 | Bayesian Inference Quantization | Unverified 0
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design | May 2, 2024 | Model Compression Neural Network Compression | Code Available 2
Efficient Compression of Multitask Multilingual Speech Models | May 2, 2024 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
Deep Learning Models in Speech Recognition: Measuring GPU Energy Consumption, Impact of Noise and Model Quantization for Edge Deployment | May 2, 2024 | GPU NVIDIA Jetson Orin Nano | Code Available 0
Joint Sequential Fronthaul Quantization and Hardware Complexity Reduction in Uplink Cell-Free Massive MIMO Networks | May 2, 2024 | Quantization | Unverified 0
Wake Vision: A Tailored Dataset and Benchmark Suite for TinyML Computer Vision Applications | May 1, 2024 | Human Detection Knowledge Distillation | Unverified 0
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey | May 1, 2024 | Quantization | Code Available 2
When Quantization Affects Confidence of Large Language Models? | May 1, 2024 | Language Modeling Language Modelling | Code Available 0
Gradient-based Automatic Mixed Precision Quantization for Neural Networks On-Chip | May 1, 2024 | Jet Tagging Quantization | Code Available 1
Investigating Automatic Scoring and Feedback using Large Language Models | May 1, 2024 | parameter-efficient fine-tuning Quantization | Unverified 0
Self-supervised Pre-training of Text Recognizers | May 1, 2024 | Quantization Transfer Learning | Code Available 0
Transition Rate Scheduling for Quantization-Aware Training | Apr 30, 2024 | Quantization Scheduling | Unverified 0
Quantized Context Based LIF Neurons for Recurrent Spiking Neural Networks in 45nm | Apr 28, 2024 | Quantization | Unverified 0
Enhancing Channel Estimation in Quantized Systems with a Generative Prior | Apr 26, 2024 | Quantization | Unverified 0
sDAC -- Semantic Digital Analog Converter for Semantic Communications | Apr 26, 2024 | Quantization Semantic Communication | Unverified 0
How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training | Apr 25, 2024 | Quantization | Unverified 0
MMGRec: Multimodal Generative Recommendation with Transformer Model | Apr 25, 2024 | model Multimodal Recommendation | Unverified 0
Semantic Routing for Enhanced Performance of LLM-Assisted Intent-Based 5G Core Network Management and Orchestration | Apr 24, 2024 | Management Prompt Engineering | Code Available 7
CoST: Contrastive Quantization based Semantic Tokenization for Generative Recommendation | Apr 23, 2024 | Decoder Language Modelling | Unverified 0
CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture | Apr 22, 2024 | GPU Quantization | Unverified 0
AdaQAT: Adaptive Bit-Width Quantization-Aware Training | Apr 22, 2024 | Quantization | Unverified 0
Latency-Distortion Tradeoffs in Communicating Classification Results over Noisy Channels | Apr 22, 2024 | Navigate Quantization | Unverified 0
An empirical study of LLaMA3 quantization: from LLMs to MLLMs | Apr 22, 2024 | Language Modelling Large Language Model | Code Available 2
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Apr 22, 2024 | Common Sense Reasoning GPU | Code Available 3
FedMPQ: Secure and Communication-Efficient Federated Learning with Multi-codebook Product Quantization | Apr 21, 2024 | Federated Learning Quantization | Unverified 0
A SER-based Device Selection Mechanism in Multi-bits Quantization Federated Learning | Apr 20, 2024 | Federated Learning Quantization | Unverified 0
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression | Apr 20, 2024 | Decoder Image Compression | Unverified 0
MAexp: A Generic Platform for RL-based Multi-Agent Exploration | Apr 19, 2024 | Diversity Multi-agent Reinforcement Learning | Code Available 2
decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points | Apr 19, 2024 | Quantization | Code Available 2
Privacy-Preserving UCB Decision Process Verification via zk-SNARKs | Apr 18, 2024 | Decision Making Privacy Preserving | Unverified 0