
CPU

Papers


Showing 26–50 of 2231 papers

Title	Date	Tasks	Status	Hype	Score
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction	May 29, 2022	Autonomous Driving, CPU	Code Available	4	5
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float	Apr 15, 2025	CPU, GPU	Code Available	4	5
Look Once to Hear: Target Speech Hearing with Noisy Examples	May 10, 2024	CPU, Speech Extraction	Code Available	4	5
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models	Mar 14, 2022	CPU, Quantization	Code Available	4	5
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge	Jun 25, 2024	Computational Efficiency, CPU	Code Available	4	5
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale	Jun 30, 2022	CPU, GPU	Code Available	4	5
Vidur: A Large-Scale Simulation Framework For LLM Inference	May 8, 2024	CPU, GPU	Code Available	4	5
Couler: Unified Machine Learning Workflow Optimization in Cloud	Mar 12, 2024	CPU	Code Available	4	5
DAMO-YOLO : A Report on Real-Time Object Detection Design	Nov 23, 2022	CPU, Neural Architecture Search	Code Available	4	5
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems	Oct 21, 2024	Automated Theorem Proving, CPU	Code Available	4	5
Data-Prep-Kit: getting your data ready for LLM application development	Sep 26, 2024	CPU, Language Modeling	Code Available	4	5
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III	Apr 8, 2025	Computational Efficiency, CPU	Code Available	3	5
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3	5
Unlimiformer: Long-Range Transformers with Unlimited Length Input	May 2, 2023	Book summarization, CPU	Code Available	3	5
FlashDMoE: Fast Distributed MoE in a Single Kernel	Jun 5, 2025	16k, CPU	Code Available	3	5
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models	Feb 10, 2024	CPU, GPU	Code Available	3	5
A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models	Oct 17, 2022	CPU, GPU	Code Available	3	5
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference	Oct 28, 2024	CPU	Code Available	3	5
SoundStream: An End-to-End Neural Audio Codec	Jul 7, 2021	CPU, Decoder	Code Available	3	5
Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes	Aug 29, 2017	BIG-bench Machine Learning, CPU	Code Available	3	5
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking	Mar 27, 2022	CPU, Multi-Object Tracking	Code Available	3	5
NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU	May 12, 2024	CPU, Deep Learning	Code Available	3	5
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices	Dec 28, 2023	AutoML, CPU	Code Available	3	5
Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing	Nov 22, 2024	Computational Efficiency, CPU	Code Available	3	5
MagicPIG: LSH Sampling for Efficient LLM Generation	Oct 21, 2024	CPU, GPU	Code Available	3	5

