SOTAVerified

Multi-agent reinforcement learning algorithm to solve a partially-observable multi-agent problem in disaster response

2020-09-18 · European Journal of Operational Research 2020

Hyun-Rok Lee, Taesik Lee


Abstract

Disaster response operations typically involve multiple decision-makers, and each decision-maker must act given only incomplete information on the current situation. To account for these characteristics – decision making by multiple decision-makers with partial observations toward a shared objective – we formulate the decision problem as a decentralized partially observable Markov decision process (dec-POMDP) model. Because optimally solving a dec-POMDP model is well known to be difficult, multi-agent reinforcement learning (MARL) has been used as a solution technique. However, typical MARL algorithms are not always effective at solving dec-POMDP models. Motivated by evidence from single-agent RL, we propose a MARL algorithm augmented by pretraining. Specifically, we use behavioral cloning (BC) as a means to pretrain a neural network. We verify the effectiveness of the proposed method by solving a dec-POMDP model for a decentralized selective patient admission problem. Experimental results on three disaster scenarios show that the proposed method is a viable approach to solving dec-POMDP problems and that augmenting MARL with BC pretraining appears to offer advantages over plain MARL in both solution quality and computation time.
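The core idea in the abstract – pretraining a policy network with behavioral cloning before MARL fine-tuning – can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's implementation: it uses a linear softmax policy, synthetic expert demonstrations, and plain gradient descent on the cross-entropy loss (which is all BC amounts to at the supervised-learning level). The paper's actual network architecture, MARL algorithm, and patient-admission environment are not reproduced here.

```python
import numpy as np

# Hypothetical BC-pretraining sketch (not the paper's code).
rng = np.random.default_rng(0)

# Synthetic "expert" demonstrations: 4-dim partial observations,
# with the expert choosing between two actions by a fixed linear rule.
obs = rng.normal(size=(500, 4))
true_w = np.array([1.0, -2.0, 0.5, 0.0])
actions = (obs @ true_w > 0).astype(int)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Linear policy: action logits = obs @ W (stand-in for a neural network).
W = np.zeros((4, 2))

def expert_match(W):
    """Fraction of states where the policy's greedy action matches the expert."""
    return float((softmax(obs @ W).argmax(axis=1) == actions).mean())

acc_before = expert_match(W)

# Behavioral cloning = supervised cross-entropy on expert (obs, action) pairs.
for _ in range(200):
    probs = softmax(obs @ W)
    probs[np.arange(len(actions)), actions] -= 1.0  # d(cross-entropy)/d(logits)
    W -= 0.1 * (obs.T @ probs) / len(actions)

acc_after = expert_match(W)
print(f"expert-match before BC: {acc_before:.2f}, after BC: {acc_after:.2f}")
```

After this supervised warm start, the pretrained weights would initialize the agents' policy networks, and MARL would fine-tune them against the shared objective – the step the paper's experiments suggest improves both solution quality and training time over random initialization.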
