SOTAVerified

CPU

Papers

Showing 76–100 of 2231 papers

| Title | Status | Hype |
| --- | --- | --- |
| Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures | | 0 |
| MULTI-LF: A Unified Continuous Learning Framework for Real-Time DDoS Detection in Multi-Environment Networks | | 0 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4 |
| Understanding and Optimizing Multi-Stage AI Inference Pipelines | | 0 |
| OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training | | 0 |
| aweSOM: a CPU/GPU-accelerated Self-organizing Map and Statistically Combined Ensemble Framework for Machine-learning Clustering Analysis | | 0 |
| Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention | | 0 |
| Wavefront Estimation From a Single Measurement: Uniqueness and Algorithms | | 0 |
| Towards On-Device Learning and Reconfigurable Hardware Implementation for Encoded Single-Photon Signal Processing | | 0 |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | | 0 |
| WoundAmbit: Bridging State-of-the-Art Semantic Segmentation and Real-World Wound Care | | 0 |
| GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Code | 2 |
| PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters | Code | 0 |
| IAEmu: Learning Galaxy Intrinsic Alignment Correlations | Code | 0 |
| Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis | | 0 |
| Exploring energy consumption of AI frameworks on a 64-core RV64 Server CPU | | 0 |
| MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism | | 0 |
| Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries | | 0 |
| SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching | | 0 |
| SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models | | 0 |
| Solving the Best Subset Selection Problem via Suboptimal Algorithms | Code | 0 |
| GPU-centric Communication Schemes for HPC and ML Applications | | 0 |
| Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | | 0 |
| Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion | | 0 |

No leaderboard results yet.