| Title | Date | Tags | Code | Count |
| --- | --- | --- | --- | --- |
| Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis | Mar 3, 2024 | 3D Parameter-Efficient Fine-Tuning for Classification, GPU | Code Available | 2 |
| LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization | Mar 2, 2024 | GPU, Quantization | Code Available | 1 |
| Parallel Hyperparameter Optimization Of Spiking Neural Network | Mar 1, 2024 | Bayesian Optimization, GPU | Code Available | 0 |
| CollaFuse: Navigating Limited Resources and Privacy in Collaborative Generative AI | Feb 29, 2024 | Autonomous Driving, Denoising | Code Available | 0 |
| Efficient Lifelong Model Evaluation in an Era of Rapid Progress | Feb 29, 2024 | Benchmarking, GPU | Code Available | 1 |
| DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Feb 29, 2024 | GPU | Code Available | 4 |
| FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Feb 29, 2024 | GPU, Language Modeling | Code Available | 5 |
| WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis | Feb 29, 2024 | Diversity, GPU | Code Available | 2 |
| FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization | Feb 28, 2024 | GPU, Quantization | Unverified | 0 |
| JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability | Feb 27, 2024 | GPU, Information Retrieval | Code Available | 0 |