| Compressing the Backward Pass of Large-Scale Neural Architectures by Structured Activation Pruning | Nov 28, 2023 | GPU, image-classification | —Unverified | 0 | 0 |
| Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment | Sep 2, 2024 | CPU, GPU | —Unverified | 0 | 0 |
| Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt | May 17, 2023 | GPU, Model Compression | —Unverified | 0 | 0 |
| Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead | Jun 17, 2024 | GPU, Model Compression | —Unverified | 0 | 0 |
| Computational Bottlenecks of Training Small-scale Large Language Models | Oct 25, 2024 | GPU, Language Modeling | —Unverified | 0 | 0 |
| Computationally Efficient Deep Neural Network for Computed Tomography Image Reconstruction | Oct 5, 2018 | Computed Tomography (CT), GPU | —Unverified | 0 | 0 |
| Computational optimization of convolutional neural networks using separated filters architecture | Feb 18, 2020 | CPU, GPU | —Unverified | 0 | 0 |
| Computational Scatter Correction for High-Resolution Flat-Panel CT Based on a Fast Monte Carlo Photon Transport Model | Jan 31, 2022 | Computed Tomography (CT), CT Reconstruction | —Unverified | 0 | 0 |
| Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference | Nov 1, 2024 | Decision Making, Gaussian Processes | —Unverified | 0 | 0 |
| Compute Or Load KV Cache? Why Not Both? | Oct 4, 2024 | GPU | —Unverified | 0 | 0 |