| Accelerating MoE Model Inference with Expert Sharding | Mar 11, 2025 | Decoder, GPU | —Unverified | 0 |
| LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Mar 11, 2025 | GPU, Image Generation | Code Available | 2 |
| Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | Mar 10, 2025 | CPU, GPU | —Unverified | 0 |
| Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data | Mar 10, 2025 | GPU | —Unverified | 0 |
| AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution | Mar 10, 2025 | GPU, Super-Resolution | —Unverified | 0 |
| Short-Term Load Forecasting for AI-Data Center | Mar 10, 2025 | GPU, Load Forecasting | —Unverified | 0 |
| AttFC: Attention Fully-Connected Layer for Large-Scale Face Recognition with One GPU | Mar 10, 2025 | Face Recognition, GPU | —Unverified | 0 |
| Efficient Distillation of Classifier-Free Guidance using Adapters | Mar 10, 2025 | GPU | Code Available | 0 |
| Global Context Is All You Need for Parallel Efficient Tractography Parcellation | Mar 10, 2025 | All, Data Augmentation | —Unverified | 0 |
| A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation | Mar 9, 2025 | GPU | —Unverified | 0 |