SOTAVerified

CPU

Papers

Showing 501-550 of 2231 papers

Title | Status | Hype
MPRM: A Markov Path-based Rule Miner for Efficient and Interpretable Knowledge Graph Reasoning | | 0
A Heuristic Algorithm Based on Beam Search and Iterated Local Search for the Maritime Inventory Routing Problem | | 0
Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets | | 0
From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification | | 0
Single-shot prediction of parametric partial differential equations | | 0
Lossless Compression for LLM Tensor Incremental Snapshots | | 0
Benchmarking of CPU-intensive Stream Data Processing in The Edge Computing Systems | | 0
Bang for the Buck: Vector Search on Cloud CPUs | | 0
FloE: On-the-Fly MoE Inference on Memory-constrained GPU | | 0
Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | | 0
Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration | | 0
Plexus: Taming Billion-edge Graphs with 3D Parallel GNN Training | | 0
Supporting renewable energy planning and operation with data-driven high-resolution ensemble weather forecast | | 0
The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning | | 0
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference | | 0
Spill The Beans: Exploiting CPU Cache Side-Channels to Leak Tokens from Large Language Models | | 0
GPRat: Gaussian Process Regression with Asynchronous Tasks | Code | 0
Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning | | 0
Accelerated 3D-3D rigid registration of echocardiographic images obtained from apical window using particle filter | | 0
GPU accelerated program synthesis: Enumerate semantics, not syntax! | | 0
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration | | 0
Dynamic Superblock Pruning for Fast Learned Sparse Retrieval | | 0
Blockchain Meets Adaptive Honeypots: A Trust-Aware Approach to Next-Gen IoT Security | | 0
ThyroidEffi 1.0: A Cost-Effective System for High-Performance Multi-Class Thyroid Carcinoma Classification | | 0
MetaDSE: A Few-shot Meta-learning Framework for Cross-workload CPU Design Space Exploration | | 0
NNTile: a machine learning framework capable of training extremely large GPT language models on a single node | | 0
Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | | 0
BitNet b1.58 2B4T Technical Report | | 0
MULTI-LF: A Unified Continuous Learning Framework for Real-Time DDoS Detection in Multi-Environment Networks | | 0
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training | | 0
Understanding and Optimizing Multi-Stage AI Inference Pipelines | | 0
Wavefront Estimation From a Single Measurement: Uniqueness and Algorithms | | 0
aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | | 0
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention | | 0
Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | | 0
MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | | 0
WoundAmbit: Bridging State-of-the-Art Semantic Segmentation and Real-World Wound Care | | 0
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters | Code | 0
IAEmu: Learning Galaxy Intrinsic Alignment Correlations | Code | 0
Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis | | 0
MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | | 0
Exploring energy consumption of AI frameworks on a 64-core RV64 Server CPU | | 0
Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | | 0
SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching | | 0
SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models | | 0
Solving the Best Subset Selection Problem via Suboptimal Algorithms | Code | 0
GPU-centric Communication Schemes for HPC and ML Applications | | 0
Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | | 0
Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion | | 0
Robust DNN Partitioning and Resource Allocation Under Uncertain Inference Time | | 0

No leaderboard results yet.