| On the Cost and Benefits of Training Context with Utterance or Full Conversation Training: A Comparative Stud | May 12, 2025 | GPU, Hallucination | Unverified | 0 |
| OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | May 12, 2025 | GPU, Privacy Preserving | Code Available | 4 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | May 12, 2025 | Continuous Control | Unverified | 0 |
| L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers | May 12, 2025 | GPU, Neural Architecture Search | Unverified | 0 |
| Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption | May 12, 2025 | GPU, Knowledge Base Question Answering | Unverified | 0 |
| Matrix Is All You Need | May 11, 2025 | All, GPU | Unverified | 0 |
| Streaming Krylov-Accelerated Stochastic Gradient Descent | May 11, 2025 | GPU, Stochastic Optimization | Unverified | 0 |
| JaxRobotarium: Training and Deploying Multi-Robot Policies in 10 Minutes | May 10, 2025 | Benchmarking, GPU | Code Available | 1 |
| QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration | May 10, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | May 9, 2025 | CPU, GPU | Unverified | 0 |