SOTAVerified

CPU

Papers

Showing 26–50 of 2231 papers

| Title | Status | Hype |
| --- | --- | --- |
| BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures | | 0 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3 |
| Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts | | 0 |
| PointODE: Lightweight Point Cloud Learning with Neural Ordinary Differential Equations on Edge | | 0 |
| Fast Feature Matching of UAV Images via Matrix Band Reduction-based GPU Data Schedule | | 0 |
| Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on mobile Intel CPUs | | 0 |
| CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs | | 0 |
| TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization | Code | 1 |
| TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis | | 0 |
| FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization | | 0 |
| Why Not Replace? Sustaining Long-Term Visual Localization via Handcrafted-Learned Feature Collaboration on CPU | Code | 1 |
| QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design | Code | 2 |
| Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Code | 0 |
| KernelOracle: Predicting the Linux Scheduler's Next Move with Deep Learning | Code | 0 |
| Harnessing Large Language Models Locally: Empirical Results and Implications for AI PC | Code | 0 |
| Machine Learning for Consistency Violation Faults Analysis | | 0 |
| FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference | | 0 |
| ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates | | 0 |
| MPRM: A Markov Path-based Rule Miner for Efficient and Interpretable Knowledge Graph Reasoning | | 0 |
| A Heuristic Algorithm Based on Beam Search and Iterated Local Search for the Maritime Inventory Routing Problem | | 0 |
| Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets | | 0 |
| From Embeddings to Accuracy: Comparing Foundation Models for Radiographic Classification | | 0 |
| SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices | Code | 1 |
| Lossless Compression for LLM Tensor Incremental Snapshots | | 0 |
| Single-shot prediction of parametric partial differential equations | | 0 |

No leaderboard results yet.