SOTAVerified

CPU

Papers

Showing 41–50 of 2231 papers

Title | Status | Hype
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Code | 3
SoundStream: An End-to-End Neural Audio Codec | Code | 3
Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes | Code | 3
Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing | Code | 3
NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU | Code | 3
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models | Code | 3
Inference Performance Optimization for Large Language Models on CPUs | Code | 3

No leaderboard results yet.