SOTAVerified


Project or Job Scheduling

Papers

Showing 526–550 of 3104 papers

| Title | Status | Hype |
|---|---|---|
| NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference | Code | 0 |
| Enhancing Adaptive Mixed-Criticality Scheduling with Deep Reinforcement Learning | | 0 |
| Multi-Agent Deep Q-Network with Layer-based Communication Channel for Autonomous Internal Logistics Vehicle Scheduling in Smart Manufacturing | | 0 |
| DynaSplit: A Hardware-Software Co-Design Framework for Energy-Aware Inference on Edge | | 0 |
| ALISE: Accelerating Large Language Model Serving with Speculative Scheduling | | 0 |
| EF-LLM: Energy Forecasting LLM with AI-assisted Automation, Enhanced Sparse Prediction, Hallucination Detection | | 0 |
| Automatic programming via large language models with population self-evolution for dynamic job shop scheduling problem | | 0 |
| Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem | | 0 |
| Bayesian Counterfactual Prediction Models for HIV Care Retention with Incomplete Outcome and Covariate Information | | 0 |
| How Does Critical Batch Size Scale in Pre-training? | Code | 1 |
| Carbon-Aware Computing for Data Centers with Probabilistic Performance Guarantees | | 0 |
| Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness | | 0 |
| SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity | | 0 |
| Capacity-Aware Planning and Scheduling in Budget-Constrained Monotonic MDPs: A Meta-RL Approach | | 0 |
| Age of Information-Oriented Probabilistic Link Scheduling for Device-to-Device Networks | | 0 |
| Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design | Code | 1 |
| Applying Neural Monte Carlo Tree Search to Unsignalized Multi-intersection Scheduling for Autonomous Vehicles | | 0 |
| Optimizing Load Scheduling in Power Grids Using Reinforcement Learning and Markov Decision Processes | | 0 |
| Fast Inference for Augmented Large Language Models | | 0 |
| Exploiting Data Centres and Local Energy Communities Synergies for Market Participation | | 0 |
| Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs | | 0 |
| ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | | 0 |
| MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers | | 0 |
| A Surrogate Model for Quay Crane Scheduling Problem | | 0 |
| AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost | | 0 |
Page 22 of 125
