Scheduling

Project or Job Scheduling

Papers

Showing 1–50 of 3104 papers

Title	Date	Tasks	Status	Hype	Score
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving	Jan 2, 2025	GPU, Scheduling	Code Available	9	5
PowerInfer-2: Fast Large Language Model Inference on a Smartphone	Jun 10, 2024	CPU, Language Modeling	Code Available	9	5
Steering Language Models with Game-Theoretic Solvers	Jan 24, 2024	Imitation Learning, Scheduling	Code Available	9	5
The Road Less Scheduled	May 24, 2024	Scheduling	Code Available	7	5
FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving	Nov 27, 2024	Fairness, GPU	Code Available	7	5
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models	Feb 6, 2023	Scheduling	Code Available	7	5
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance	Jun 4, 2025	Benchmarking, Scheduling	Code Available	5	5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Aug 21, 2024	GPU, Quantization	Code Available	5	5
FlowTok: Flowing Seamlessly Across Text and Image Tokens	Mar 13, 2025	Denoising, Image to text	Code Available	5	5
Orion-14B: Open-source Multilingual Large Language Models	Jan 20, 2024	Scheduling	Code Available	4	5
Vidur: A Large-Scale Simulation Framework For LLM Inference	May 8, 2024	CPU, GPU	Code Available	4	5
FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training	Mar 3, 2023	Federated Learning, GPU	Code Available	4	5
One Step Diffusion via Shortcut Models	Oct 16, 2024	Denoising, Scheduling	Code Available	4	5
PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices	May 30, 2024	Scheduling	Code Available	4	5
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints	Apr 15, 2025	GPU, Inference Optimization	Code Available	4	5
ServerlessLLM: Low-Latency Serverless Inference for Large Language Models	Jan 25, 2024	GPU, Scheduling	Code Available	4	5
Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1	Oct 3, 2024	Scheduling	Code Available	3	5
MNN: A Universal and Efficient Inference Engine	Feb 27, 2020	Deep Learning, Diversity	Code Available	3	5
FlashDMoE: Fast Distributed MoE in a Single Kernel	Jun 5, 2025	16k, CPU	Code Available	3	5
Fairness in Serving Large Language Models	Dec 31, 2023	Fairness, Scheduling	Code Available	3	5
FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering	Aug 15, 2024	Computational Efficiency, Scheduling	Code Available	3	5
Efficiently Serving LLM Reasoning Programs with Certaindex	Dec 30, 2024	Code Generation, Mathematical Problem-Solving	Code Available	3	5
LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management	Oct 1, 2024	GPU, Language Modeling	Code Available	3	5
A Survey on Large Language Model Acceleration based on KV Cache Management	Dec 27, 2024	Language Modeling, Language Modelling	Code Available	3	5
Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System	Apr 23, 2020	Scheduling	Code Available	3	5
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve	Mar 4, 2024	GPU, Scheduling	Code Available	3	5
Vine Copulas as Differentiable Computational Graphs	Jun 16, 2025	GPU, Scheduling	Code Available	3	5
A Survey on Inference Optimization Techniques for Mixture of Experts Models	Dec 18, 2024	Computational Efficiency, Distributed Computing	Code Available	3	5
Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule	May 12, 2025	Drug Design, Scheduling	Code Available	2	5
Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning	Jun 23, 2025	GPU, Large Language Model	Code Available	2	5
MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs	Mar 28, 2024	AI Agent, Minecraft	Code Available	2	5
NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing	Jun 10, 2024	Scheduling, Video Editing	Code Available	2	5
Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services	Jun 27, 2024	Scheduling	Code Available	2	5
Learning to Solve Job Shop Scheduling under Uncertainty	Mar 4, 2024	Combinatorial Optimization, Deep Reinforcement Learning	Code Available	2	5
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning	Aug 20, 2024	Denoising, Image Generation	Code Available	2	5
Preble: Efficient Distributed Prompt Scheduling for LLM Serving	May 8, 2024	GPU, Scheduling	Code Available	2	5
Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow	Jun 3, 2024	GPU, Language Modeling	Code Available	2	5
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World	Mar 31, 2025	Robot Manipulation, Scheduling	Code Available	2	5
Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs	Oct 18, 2022	Deep Learning, Scheduling	Code Available	2	5
FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation	Apr 19, 2024	Decoder, Network Embedding	Code Available	2	5
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering	Jun 10, 2025	Scheduling	Code Available	2	5
Human-in-the-Loop Large-Scale Predictive Maintenance of Workstations	Jun 23, 2022	Active Learning, Scheduling	Code Available	2	5
Efficient LLM Scheduling by Learning to Rank	Aug 28, 2024	Blocking, Chatbot	Code Available	2	5
ChaCha for Online AutoML	Jun 9, 2021	AutoML, Scheduling	Code Available	2	5
EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting	Jun 22, 2024	Language Modeling, Language Modelling	Code Available	2	5
BMInf: An Efficient Toolkit for Big Model Inference and Tuning	May 1, 2022	CPU, GPU	Code Available	2	5
ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning	Dec 11, 2021	Deep Reinforcement Learning, GPU	Code Available	2	5
Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents	May 17, 2025	Language Modeling, Language Modelling	Code Available	2	5
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)	Jan 16, 2024	Scheduling	Code Available	2	5
AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms	Feb 21, 2025	Scheduling	Code Available	2	5

