| Anant-Net: Breaking the Curse of Dimensionality with Scalable and Interpretable Neural Surrogate for High-Dimensional PDEs | May 6, 2025 | GPU, Kolmogorov-Arnold Networks | Unverified | 0 |
| NBF at SemEval-2025 Task 5: Light-Burst Attention Enhanced System for Multilingual Subject Recommendation | May 6, 2025 | GPU, Retrieval | Unverified | 0 |
| Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving | May 6, 2025 | GPU, Scheduling | Unverified | 0 |
| Quantitative Analysis of Performance Drop in DeepSeek Model Quantization | May 5, 2025 | GPU, Quantization | Code Available | 0 |
| RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference | May 5, 2025 | CPU, GPU | Unverified | 0 |
| QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach | May 4, 2025 | Code Generation, GPU | Unverified | 0 |
| Sparfels: Fast Reconstruction from Sparse Unposed Imagery | May 4, 2025 | GPU | Unverified | 0 |
| A UNet Model for Accelerated Preprocessing of CRISM Hyperspectral Data for Mineral Identification on Mars | May 4, 2025 | GPU | Unverified | 0 |
| Phantora: Live GPU Cluster Simulation for Machine Learning System Performance Estimation | May 2, 2025 | GPU | Unverified | 0 |
| Feature Optimization for Time Series Forecasting via Novel Randomized Uphill Climbing | May 2, 2025 | GPU, Multivariate Time Series Forecasting | Unverified | 0 |
| Efficient On-Chip Implementation of 4D Radar-Based 3D Object Detection on Hailo-8L | May 1, 2025 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Aggregating empirical evidence from data strategy studies: a case on model quantization | May 1, 2025 | GPU, Quantization | Unverified | 0 |
| Sionna RT: Technical Report | Apr 30, 2025 | GPU | Unverified | 0 |
| Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning | Apr 29, 2025 | CPU, GPU | Unverified | 0 |
| TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models | Apr 29, 2025 | Benchmarking, Dataset Generation | Code Available | 0 |
| Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | Apr 28, 2025 | Continual Pretraining, GPU | Unverified | 0 |
| semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | Apr 28, 2025 | GPU, Large Language Model | Unverified | 0 |
| FlashOverlap: A Lightweight Design for Efficiently Overlapping Communication and Computation | Apr 28, 2025 | GPU | Unverified | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | Apr 28, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI | Apr 27, 2025 | GPU | Unverified | 0 |
| GPU accelerated program synthesis: Enumerate semantics, not syntax! | Apr 26, 2025 | CPU, GPU | Unverified | 0 |
| Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Apr 26, 2025 | Benchmarking, GPU | Code Available | 0 |
| The Big Send-off: High Performance Collectives on GPU-based Supercomputers | Apr 25, 2025 | GPU, Language Modeling | Unverified | 0 |
| L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference | Apr 24, 2025 | GPU | Unverified | 0 |
| Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification | Apr 23, 2025 | Emotion Classification, GPU | Unverified | 0 |