| Balanced and Elastic End-to-end Training of Dynamic LLMs | May 20, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity | May 20, 2025 | GPU, Large Language Model | Code Available | 0 |
| Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers | May 20, 2025 | GPU, Video Generation | Code Available | 2 |
| UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache | May 20, 2025 | 4k, 8k | Unverified | 0 |
| UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | May 20, 2025 | GPU, Lifelong learning | Code Available | 2 |
| 4D-ROLLS: 4D Radar Occupancy Learning via LiDAR Supervision | May 20, 2025 | Autonomous Vehicles, BEV Segmentation | Code Available | 0 |
| ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs | May 20, 2025 | GPU, Large Language Model | Unverified | 0 |
| Multi-head Temporal Latent Attention | May 19, 2025 | GPU, speech-recognition | Code Available | 4 |
| Frozen Backpropagation: Relaxing Weight Symmetry in Temporally-Coded Deep Spiking Neural Networks | May 19, 2025 | GPU | Code Available | 0 |
| Half Search Space is All You Need | May 19, 2025 | All, GPU | Unverified | 0 |