| Benchmarking of CPU-intensive Stream Data Processing in The Edge Computing Systems | May 12, 2025 | Benchmarking, Computational Efficiency | Unverified | 0 |
| Bang for the Buck: Vector Search on Cloud CPUs | May 12, 2025 | CPU, Quantization | Unverified | 0 |
| Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | May 9, 2025 | CPU, GPU | Unverified | 0 |
| Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates | May 9, 2025 | Audio Synthesis, CPU | Code Available | 1 |
| FloE: On-the-Fly MoE Inference on Memory-constrained GPU | May 9, 2025 | CPU, GPU | Unverified | 0 |
| Plexus: Taming Billion-edge Graphs with 3D Parallel GNN Training | May 7, 2025 | CPU, GPU | Unverified | 0 |
| Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration | May 7, 2025 | CPU, Face Detection | Unverified | 0 |
| Supporting renewable energy planning and operation with data-driven high-resolution ensemble weather forecast | May 7, 2025 | CPU, GPU | Unverified | 0 |
| The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning | May 6, 2025 | CPU, Gaussian Processes | Unverified | 0 |
| RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference | May 5, 2025 | CPU, GPU | Unverified | 0 |
| Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression | May 3, 2025 | CPU | Code Available | 1 |
| Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models | May 1, 2025 | CPU | Unverified | 0 |
| GPRat: Gaussian Process Regression with Asynchronous Tasks | Apr 30, 2025 | C++ code, CPU | Code Available | 0 |
| Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning | Apr 29, 2025 | CPU, GPU | Unverified | 0 |
| Accelerated 3D-3D rigid registration of echocardiographic images obtained from apical window using particle filter | Apr 28, 2025 | CPU, Temporal Sequences | Unverified | 0 |
| Mesh-Learner: Texturing Mesh with Spherical Harmonics | Apr 28, 2025 | 3D Reconstruction, CPU | Code Available | 1 |
| GPU accelerated program synthesis: Enumerate semantics, not syntax! | Apr 26, 2025 | CPU, GPU | Unverified | 0 |
| On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration | Apr 24, 2025 | CPU, Model Compression | Unverified | 0 |
| Dynamic Superblock Pruning for Fast Learned Sparse Retrieval | Apr 23, 2025 | CPU, Retrieval | Unverified | 0 |
| Blockchain Meets Adaptive Honeypots: A Trust-Aware Approach to Next-Gen IoT Security | Apr 22, 2025 | Anomaly Detection, CPU | Unverified | 0 |
| ThyroidEffi 1.0: A Cost-Effective System for High-Performance Multi-Class Thyroid Carcinoma Classification | Apr 19, 2025 | CPU, Diagnostic | Unverified | 0 |
| MetaDSE: A Few-shot Meta-learning Framework for Cross-workload CPU Design Space Exploration | Apr 18, 2025 | CPU, Meta-Learning | Unverified | 0 |
| NNTile: a machine learning framework capable of training extremely large GPT language models on a single node | Apr 17, 2025 | CPU, GPU | Unverified | 0 |
| Chinese-Vicuna: A Chinese Instruction-following Llama-based Model | Apr 17, 2025 | Code Generation, CPU | Code Available | 7 |
| BitNet b1.58 2B4T Technical Report | Apr 16, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | Apr 16, 2025 | CPU, GPU | Unverified | 0 |
| MULTI-LF: A Unified Continuous Learning Framework for Real-Time DDoS Detection in Multi-Environment Networks | Apr 15, 2025 | CPU | Unverified | 0 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Apr 15, 2025 | CPU, GPU | Code Available | 4 |
| Understanding and Optimizing Multi-Stage AI Inference Pipelines | Apr 14, 2025 | CPU, Navigate | Unverified | 0 |
| OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training | Apr 14, 2025 | CPU | Unverified | 0 |
| aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | Apr 13, 2025 | CPU, GPU | Unverified | 0 |
| Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention | Apr 13, 2025 | CPU, Highlight Detection | Unverified | 0 |
| Wavefront Estimation From a Single Measurement: Uniqueness and Algorithms | Apr 13, 2025 | CPU | Unverified | 0 |
| Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | Apr 12, 2025 | CPU, GPU | Unverified | 0 |
| WoundAmbit: Bridging State-of-the-Art Semantic Segmentation and Real-World Wound Care | Apr 8, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Apr 8, 2025 | Computational Efficiency, CPU | Code Available | 3 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 |
| PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters | Apr 7, 2025 | CPU, GPU | Code Available | 0 |
| IAEmu: Learning Galaxy Intrinsic Alignment Correlations | Apr 7, 2025 | CPU, Position | Code Available | 0 |
| Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis | Apr 4, 2025 | CPU, GPU | Unverified | 0 |
| Exploring energy consumption of AI frameworks on a 64-core RV64 Server CPU | Apr 3, 2025 | CPU | Unverified | 0 |
| MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | Apr 3, 2025 | CPU, GPU | Unverified | 0 |
| Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | Apr 2, 2025 | Benchmarking, Computational Efficiency | Unverified | 0 |
| SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching | Apr 1, 2025 | Computational Efficiency, CPU | Unverified | 0 |
| SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models | Apr 1, 2025 | CPU, GPU | Unverified | 0 |
| Solving the Best Subset Selection Problem via Suboptimal Algorithms | Mar 31, 2025 | CPU | Code Available | 0 |
| GPU-centric Communication Schemes for HPC and ML Applications | Mar 31, 2025 | CPU, GPU | Unverified | 0 |
| Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | Mar 31, 2025 | CPU, GPU | Unverified | 0 |
| Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion | Mar 29, 2025 | CPU | Unverified | 0 |