
Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale

2024-03-01

Emile Anand, Guannan Qu


Abstract

We study reinforcement learning for global decision-making in the presence of local agents, where the global decision-maker makes decisions affecting all local agents and the objective is to learn a policy that maximizes their joint rewards. Such problems have many applications, e.g., demand response, EV charging, and queueing. In this setting, scalability has been a long-standing challenge because the size of the state space can be exponential in the number of agents. This work proposes the SUBSAMPLE-Q algorithm, in which the global agent subsamples k ≤ n local agents to compute a policy in time polynomial in k. We show that the learned policy converges to the optimal policy at rate Õ(1/√k + ε_{k,m}) as the number of subsampled agents k increases, where ε_{k,m} is the Bellman noise. Finally, we validate the theory through numerical simulations in a demand-response setting and a queueing setting.
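The core idea the abstract describes, acting on a subsample of k of the n local agents so that per-decision cost depends on k rather than n, can be illustrated with a minimal sketch. This is not the paper's SUBSAMPLE-Q implementation; the function `q_fn` and its signature are hypothetical stand-ins for a learned surrogate Q-function over the global state and the sampled local states.

```python
import random

def subsample_greedy_action(global_state, local_states, q_fn, actions, k, seed=0):
    """Pick a greedy global action using only k subsampled local agents.

    Hypothetical sketch: `q_fn(global_state, sampled_local_states, action)`
    returns a scalar Q-estimate. Evaluating it on k << n sampled agents
    keeps the per-decision cost polynomial in k, independent of n.
    """
    rng = random.Random(seed)
    sampled = rng.sample(local_states, k)  # uniform subsample of k agents
    # Greedy action with respect to the subsample-based surrogate Q:
    return max(actions, key=lambda a: q_fn(global_state, sampled, a))
```

For example, with a toy `q_fn` that rewards action 1 whenever the sampled local states sum to a positive value, the routine returns action 1 while only ever touching k of the local agents.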
