SOTAVerified

CPU

Papers

Showing 1–50 of 2231 papers

Title | Status | Hype
WebLLM: A High-Performance In-Browser LLM Inference Engine | Code | 11
Magika: AI-Powered Content-Type Detection | Code | 11
PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models | Code | 9
PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Code | 9
Chinese-Vicuna: A Chinese Instruction-following Llama-based Model | Code | 7
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Code | 7
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Code | 7
Full Scaling Automation for Sustainable Development of Green Data Centers | Code | 7
Elixir: Train a Large Language Model on a Small GPU Cluster | Code | 7
Fast On-device LLM Inference with NPUs | Code | 5
XFeat: Accelerated Features for Lightweight Image Matching | Code | 5
Extreme Compression of Large Language Models via Additive Quantization | Code | 5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Code | 5
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Code | 5
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | Code | 5
Vectorized and performance-portable Quicksort | Code | 5
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4
SocialED: A Python Library for Social Event Detection | Code | 4
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems | Code | 4
Data-Prep-Kit: getting your data ready for LLM application development | Code | 4
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning | Code | 4
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge | Code | 4
Look Once to Hear: Target Speech Hearing with Noisy Examples | Code | 4
Vidur: A Large-Scale Simulation Framework For LLM Inference | Code | 4
Couler: Unified Machine Learning Workflow Optimization in Cloud | Code | 4
Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series | Code | 4
FFCV: Accelerating Training by Removing Data Bottlenecks | Code | 4
DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement | Code | 4
DAMO-YOLO: A Report on Real-Time Object Detection Design | Code | 4
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code | 4
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | Code | 4
PLAID: An Efficient Engine for Late Interaction Retrieval | Code | 4
DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio | Code | 4
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models | Code | 4
GPUTreeShap: Massively Parallel Exact Calculation of SHAP Scores for Tree Ensembles | Code | 4
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory | Code | 3
Nd-BiMamba2: A Unified Bidirectional Architecture for Multi-Dimensional Data Processing | Code | 3
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving | Code | 3
Inference Performance Optimization for Large Language Models on CPUs | Code | 3
NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU | Code | 3
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3
XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library | Code | 3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3
Unlimiformer: Long-Range Transformers with Unlimited Length Input | Code | 3
Page 1 of 45