SOTAVerified
Scheduling
Project or Job Scheduling
Papers
Showing 1–50 of 3104 papers
| Title | Date | Tasks | Status | Hype |
|---|---|---|---|---|
| PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Jun 10, 2024 | CPU, Language Modeling | Code Available | 9 |
| FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Jan 2, 2025 | GPU, Scheduling | Code Available | 9 |
| Steering Language Models with Game-Theoretic Solvers | Jan 24, 2024 | Imitation Learning, Scheduling | Code Available | 9 |
| Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models | Feb 6, 2023 | Scheduling | Code Available | 7 |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Nov 27, 2024 | Fairness, GPU | Code Available | 7 |
| The Road Less Scheduled | May 24, 2024 | Scheduling | Code Available | 7 |
| FlowTok: Flowing Seamlessly Across Text and Image Tokens | Mar 13, 2025 | Denoising, Image to text | Code Available | 5 |
| MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Aug 21, 2024 | GPU, Quantization | Code Available | 5 |
| AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Jun 4, 2025 | Benchmarking, Scheduling | Code Available | 5 |
| Orion-14B: Open-source Multilingual Large Language Models | Jan 20, 2024 | Scheduling | Code Available | 4 |
| Vidur: A Large-Scale Simulation Framework For LLM Inference | May 8, 2024 | CPU, GPU | Code Available | 4 |
| FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training | Mar 3, 2023 | Federated Learning, GPU | Code Available | 4 |
| One Step Diffusion via Shortcut Models | Oct 16, 2024 | Denoising, Scheduling | Code Available | 4 |
| PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices | May 30, 2024 | Scheduling | Code Available | 4 |
| ServerlessLLM: Low-Latency Serverless Inference for Large Language Models | Jan 25, 2024 | GPU, Scheduling | Code Available | 4 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Apr 15, 2025 | GPU, Inference Optimization | Code Available | 4 |
| Planning in Strawberry Fields: Evaluating and Improving the Planning and Scheduling Capabilities of LRM o1 | Oct 3, 2024 | Scheduling | Code Available | 3 |
| Efficiently Serving LLM Reasoning Programs with Certaindex | Dec 30, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 3 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3 |
| A Survey on Large Language Model Acceleration based on KV Cache Management | Dec 27, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Aug 15, 2024 | Computational Efficiency, Scheduling | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |
| Vine Copulas as Differentiable Computational Graphs | Jun 16, 2025 | GPU, Scheduling | Code Available | 3 |
| Fairness in Serving Large Language Models | Dec 31, 2023 | Fairness, Scheduling | Code Available | 3 |
| Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System | Apr 23, 2020 | Scheduling | Code Available | 3 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve | Mar 4, 2024 | GPU, Scheduling | Code Available | 3 |
| MNN: A Universal and Efficient Inference Engine | Feb 27, 2020 | Deep Learning, Diversity | Code Available | 3 |
| Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule | May 12, 2025 | Drug Design, Scheduling | Code Available | 2 |
| Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Jun 23, 2025 | GPU, Large Language Model | Code Available | 2 |
| Characterization of Large Language Model Development in the Datacenter | Mar 12, 2024 | GPU, Language Modeling | Code Available | 2 |
| Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services | Jun 27, 2024 | Scheduling | Code Available | 2 |
| NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing | Jun 10, 2024 | Scheduling, Video Editing | Code Available | 2 |
| MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs | Mar 28, 2024 | AI Agent, Minecraft | Code Available | 2 |
| AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | Mar 31, 2025 | Robot Manipulation, Scheduling | Code Available | 2 |
| ChaCha for Online AutoML | Jun 9, 2021 | AutoML, Scheduling | Code Available | 2 |
| MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Aug 20, 2024 | Denoising, Image Generation | Code Available | 2 |
| Preble: Efficient Distributed Prompt Scheduling for LLM Serving | May 8, 2024 | GPU, Scheduling | Code Available | 2 |
| AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms | Feb 21, 2025 | Scheduling | Code Available | 2 |
| Learning to Solve Job Shop Scheduling under Uncertainty | Mar 4, 2024 | Combinatorial Optimization, Deep Reinforcement Learning | Code Available | 2 |
| Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs | Oct 18, 2022 | Deep Learning, Scheduling | Code Available | 2 |
| Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow | Jun 3, 2024 | GPU, Language Modeling | Code Available | 2 |
| Human-in-the-Loop Large-Scale Predictive Maintenance of Workstations | Jun 23, 2022 | Active Learning, Scheduling | Code Available | 2 |
| mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | Dec 5, 2023 | GPU, Large Language Model | Code Available | 2 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 |
| ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering | Jun 10, 2025 | Scheduling | Code Available | 2 |
| evosax: JAX-based Evolution Strategies | Dec 8, 2022 | CPU, Scheduling | Code Available | 2 |
| ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning | Dec 11, 2021 | Deep Reinforcement Learning, GPU | Code Available | 2 |
| BMInf: An Efficient Toolkit for Big Model Inference and Tuning | May 1, 2022 | CPU, GPU | Code Available | 2 |
| FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation | Apr 19, 2024 | Decoder, Network Embedding | Code Available | 2 |