| Preble: Efficient Distributed Prompt Scheduling for LLM Serving | May 8, 2024 | GPU, Scheduling | Code Available | 2 |
| WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting | May 1, 2024 | Scheduling | Code Available | 2 |
| FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation | Apr 19, 2024 | Decoder, Network Embedding | Code Available | 2 |
| MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs | Mar 28, 2024 | AI Agent, Minecraft | Code Available | 2 |
| Characterization of Large Language Model Development in the Datacenter | Mar 12, 2024 | GPU, Language Modeling | Code Available | 2 |
| Learning to Solve Job Shop Scheduling under Uncertainty | Mar 4, 2024 | Combinatorial Optimization, Deep Reinforcement Learning | Code Available | 2 |
| DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent) | Jan 16, 2024 | Scheduling | Code Available | 2 |
| DPoser: Diffusion Model as Robust 3D Human Pose Prior | Dec 9, 2023 | Denoising, Human Mesh Recovery | Code Available | 2 |
| mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | Dec 5, 2023 | GPU, Large Language Model | Code Available | 2 |
| Zero Bubble Pipeline Parallelism | Nov 30, 2023 | Scheduling | Code Available | 2 |
| SkiROS2: A skill-based Robot Control Platform for ROS | Jun 29, 2023 | Scheduling, Task Planning | Code Available | 2 |
| evosax: JAX-based Evolution Strategies | Dec 8, 2022 | CPU, Scheduling | Code Available | 2 |
| Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs | Oct 18, 2022 | Deep Learning, Scheduling | Code Available | 2 |
| Human-in-the-Loop Large-Scale Predictive Maintenance of Workstations | Jun 23, 2022 | Active Learning, Scheduling | Code Available | 2 |
| BMInf: An Efficient Toolkit for Big Model Inference and Tuning | May 1, 2022 | CPU, GPU | Code Available | 2 |
| TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs | Mar 28, 2022 | CPU, GPU | Code Available | 2 |
| ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning | Dec 11, 2021 | Deep Reinforcement Learning, GPU | Code Available | 2 |
| ChaCha for Online AutoML | Jun 9, 2021 | AutoML, Scheduling | Code Available | 2 |
| ConsumerBench: Benchmarking Generative AI Applications on End-User Devices | Jun 21, 2025 | Benchmarking, CPU | Code Available | 1 |
| All is Not Lost: LLM Recovery without Checkpoints | Jun 18, 2025 | All, Scheduling | Code Available | 1 |
| A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints | Jun 16, 2025 | Job Shop Scheduling, Reinforcement Learning (RL) | Code Available | 1 |
| Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Jun 5, 2025 | Action Generation, Asynchronous Group Communication | Code Available | 1 |
| Diagonal Batching Unlocks Parallelism in Recurrent Memory Transformers for Long Contexts | Jun 5, 2025 | GPU, Scheduling | Code Available | 1 |
| Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility | May 27, 2025 | 3DGS, Scheduling | Code Available | 1 |
| Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs | May 26, 2025 | Computational Efficiency, Scheduling | Code Available | 1 |
| Task Memory Engine: Spatial Memory for Robust Multi-Step LLM Agents | May 26, 2025 | Scheduling | Code Available | 1 |
| Structured Reinforcement Learning for Combinatorial Decision-Making | May 25, 2025 | Combinatorial Optimization, Decision Making | Code Available | 1 |
| Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs | May 20, 2025 | Scheduling | Code Available | 1 |
| GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy | May 18, 2025 | Cloud Computing, Deep Reinforcement Learning | Code Available | 1 |
| FastCar: Cache Attentive Replay for Fast Auto-Regressive Video Generation on the Edge | May 17, 2025 | Image Generation, Scheduling | Code Available | 1 |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Apr 28, 2025 | GPU, Miscellaneous | Code Available | 1 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Apr 10, 2025 | GPU, Large Language Model | Code Available | 1 |
| Injecting Adrenaline into LLM Serving: Boosting Resource Utilization and Throughput via Attention Disaggregation | Mar 26, 2025 | Large Language Model, Scheduling | Code Available | 1 |
| Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization | Mar 24, 2025 | Navigate, Scheduling | Code Available | 1 |
| SkyLadder: Better and Faster Pretraining via Context Window Scheduling | Mar 19, 2025 | 8k, Scheduling | Code Available | 1 |
| SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks | Feb 27, 2025 | Scheduling | Code Available | 1 |
| Starjob: Dataset for LLM-Driven Job Shop Scheduling | Feb 26, 2025 | Combinatorial Optimization, Job Shop Scheduling | Code Available | 1 |
| Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling | Feb 18, 2025 | Combinatorial Optimization, Job Shop Scheduling | Code Available | 1 |
| The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training | Jan 31, 2025 | Scheduling | Code Available | 1 |
| An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem | Jan 23, 2025 | Denoising, Scheduling | Code Available | 1 |
| CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Jan 14, 2025 | Deep Reinforcement Learning, GPU | Code Available | 1 |
| Dynamics-incorporated Modeling Framework for Stability Constrained Scheduling Under High-penetration of Renewable Energy | Jan 10, 2025 | Scheduling | Code Available | 1 |
| Accelerating AIGC Services with Latent Action Diffusion Scheduling in Edge Networks | Dec 24, 2024 | Scheduling | Code Available | 1 |
| Brain-to-Text Benchmark '24: Lessons Learned | Dec 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Multi Agent Reinforcement Learning for Sequential Satellite Assignment Problems | Dec 20, 2024 | Combinatorial Optimization, Multi-agent Reinforcement Learning | Code Available | 1 |
| Neural Combinatorial Optimization for Stochastic Flexible Job Shop Scheduling Problems | Dec 18, 2024 | Combinatorial Optimization, Job Shop Scheduling | Code Available | 1 |
| Grid: Omni Visual Generation | Dec 14, 2024 | Image Generation, Scheduling | Code Available | 1 |
| From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection | Dec 13, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Digital Transformation in the Water Distribution System based on the Digital Twins Concept | Dec 9, 2024 | Decision Making, Scheduling | Code Available | 1 |
| Robust Planning with Compound LLM Architectures: An LLM-Modulo Approach | Nov 20, 2024 | Language Modeling, Language Modelling | Code Available | 1 |