| Mesh-Learner: Texturing Mesh with Spherical Harmonics | Apr 28, 2025 | 3D Reconstruction, CPU | Code Available | 1 |
| Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | Apr 28, 2025 | Continual Pretraining, GPU | —Unverified | 0 |
| FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation | Apr 28, 2025 | GPU | —Unverified | 0 |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Apr 28, 2025 | GPU, Miscellaneous | Code Available | 1 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | Apr 28, 2025 | GPU, Mixture-of-Experts | —Unverified | 0 |
| semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | Apr 28, 2025 | GPU, Large Language Model | —Unverified | 0 |
| NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI | Apr 27, 2025 | GPU | —Unverified | 0 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | Benchmarking, GPU | Code Available | 0 |
| GPU accelerated program synthesis: Enumerate semantics, not syntax! | Apr 26, 2025 | CPU, GPU | —Unverified | 0 |
| The Big Send-off: High Performance Collectives on GPU-based Supercomputers | Apr 25, 2025 | GPU, Language Modeling | —Unverified | 0 |