SOTAVerified

Scheduling

Project or Job Scheduling

Papers

Showing 1-10 of 3104 papers

Title | Status | Hype
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Code | 9
PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Code | 9
Steering Language Models with Game-Theoretic Solvers | Code | 9
FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Code | 7
The Road Less Scheduled | Code | 7
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Code | 7
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Code | 5
FlowTok: Flowing Seamlessly Across Text and Image Tokens | Code | 5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Code | 5
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4

No leaderboard results yet.