| BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures | Jun 6, 2025 | Benchmarking, CPU | Unverified | 0 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts | Jun 2, 2025 | CPU | Unverified | 0 |
| PointODE: Lightweight Point Cloud Learning with Neural Ordinary Differential Equations on Edge | May 31, 2025 | CPU | Unverified | 0 |
| Fast Feature Matching of UAV Images via Matrix Band Reduction-based GPU Data Schedule | May 28, 2025 | CPU, GPU | Unverified | 0 |
| Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on mobile Intel CPUs | May 28, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs | May 28, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | May 26, 2025 | CPU, GPU | Code Available | 1 |
| TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis | May 25, 2025 | CPU, GPU | Unverified | 0 |
| FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization | May 25, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Why Not Replace? Sustaining Long-Term Visual Localization via Handcrafted-Learned Feature Collaboration on CPU | May 24, 2025 | CPU, Keypoint Detection | Code Available | 1 |
| QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design | May 22, 2025 | CPU, GPU | Code Available | 2 |
| Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | May 21, 2025 | All, CPU | Code Available | 0 |
| KernelOracle: Predicting the Linux Scheduler's Next Move with Deep Learning | May 21, 2025 | CPU, Deep Learning | Code Available | 0 |
| Harnessing Large Language Models Locally: Empirical Results and Implications for AI PC | May 21, 2025 | CPU, Quantization | Code Available | 0 |
| Machine Learning for Consistency Violation Faults Analysis | May 20, 2025 | CPU | Unverified | 0 |
| FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference | May 19, 2025 | CPU, GPU | Unverified | 0 |
| ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates | May 18, 2025 | CPU, GPU | Unverified | 0 |
| MPRM: A Markov Path-based Rule Miner for Efficient and Interpretable Knowledge Graph Reasoning | May 18, 2025 | CPU, Knowledge Graphs | Unverified | 0 |
| A Heuristic Algorithm Based on Beam Search and Iterated Local Search for the Maritime Inventory Routing Problem | May 17, 2025 | CPU | Unverified | 0 |
| Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets | May 16, 2025 | CPU, Scheduling | Unverified | 0 |
| From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification | May 16, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices | May 15, 2025 | CPU, GPU | Code Available | 1 |
| Lossless Compression for LLM Tensor Incremental Snapshots | May 14, 2025 | CPU | Unverified | 0 |
| Single-shot prediction of parametric partial differential equations | May 14, 2025 | CPU, GPU | Unverified | 0 |