| One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning | Jan 28, 2025 | Few-Shot Learning, GPU | Unverified | 0 |
| Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | Jan 27, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| PISCO: Pretty Simple Compression for Retrieval-Augmented Generation | Jan 27, 2025 | GPU, Knowledge Distillation | Unverified | 0 |
| Towards Scalable Topological Regularizers | Jan 24, 2025 | Domain Adaptation, GPU | Unverified | 0 |
| GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models | Jan 22, 2025 | GPU, Quantization | Code Available | 0 |
| 3DGS^2: Near Second-order Converging 3D Gaussian Splatting | Jan 22, 2025 | 3DGS, 3D Reconstruction | Unverified | 0 |
| Learning Versatile Optimizers on a Compute Diet | Jan 22, 2025 | GPU | Code Available | 0 |
| HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation | Jan 22, 2025 | CPU, GPU | Unverified | 0 |
| Irrational Complex Rotations Empower Low-bit Optimizers | Jan 22, 2025 | GPU, Quantization | Unverified | 0 |
| Pushing the Limits of BFP on Narrow Precision LLM Inference | Jan 21, 2025 | GPU | Unverified | 0 |