SOTAVerified
Scheduling
Project or Job Scheduling
Papers
Showing 1–25 of 3104 papers
| Title | Date | Tasks | Status | Hype |
|---|---|---|---|---|
| FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Jan 2, 2025 | GPU, Scheduling | Code Available | 9 |
| Steering Language Models with Game-Theoretic Solvers | Jan 24, 2024 | Imitation Learning, Scheduling | Code Available | 9 |
| PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Jun 10, 2024 | CPU, Language Modeling | Code Available | 9 |
| The Road Less Scheduled | May 24, 2024 | Scheduling | Code Available | 7 |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Nov 27, 2024 | Fairness, GPU | Code Available | 7 |
| Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Feb 6, 2023 | Scheduling | Code Available | 7 |
| MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Aug 21, 2024 | GPU, Quantization | Code Available | 5 |
| AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Jun 4, 2025 | Benchmarking, Scheduling | Code Available | 5 |
| FlowTok: Flowing Seamlessly Across Text and Image Tokens | Mar 13, 2025 | Denoising, Image to text | Code Available | 5 |
| ServerlessLLM: Low-Latency Serverless Inference for Large Language Models | Jan 25, 2024 | GPU, Scheduling | Code Available | 4 |
| FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training | Mar 3, 2023 | Federated Learning, GPU | Code Available | 4 |
| One Step Diffusion via Shortcut Models | Oct 16, 2024 | Denoising, Scheduling | Code Available | 4 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 |
| Vidur: A Large-Scale Simulation Framework For LLM Inference | May 8, 2024 | CPU, GPU | Code Available | 4 |
| Orion-14B: Open-source Multilingual Large Language Models | Jan 20, 2024 | Scheduling | Code Available | 4 |
| PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices | May 30, 2024 | Scheduling | Code Available | 4 |
| Fairness in Serving Large Language Models | Dec 31, 2023 | Fairness, Scheduling | Code Available | 3 |
| Efficiently Serving LLM Reasoning Programs with Certaindex | Dec 30, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 3 |
| Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1 | Oct 3, 2024 | Scheduling | Code Available | 3 |
| MNN: A Universal and Efficient Inference Engine | Feb 27, 2020 | Deep Learning, Diversity | Code Available | 3 |
| Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System | Apr 23, 2020 | Scheduling | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |
| FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Aug 15, 2024 | Computational Efficiency, Scheduling | Code Available | 3 |
| A Survey on Large Language Model Acceleration based on KV Cache Management | Dec 27, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3 |
No leaderboard results yet.