SOTAVerified

Multi-agent reinforcement learning algorithm to solve a partially-observable multi-agent problem in disaster response

2020-09-18 · European Journal of Operational Research 2020

Hyun-Rok Lee, Taesik Lee


Abstract

Disaster response operations typically involve multiple decision-makers, and each decision-maker must act given only incomplete information on the current situation. To account for these characteristics – decision making by multiple decision-makers with partial observations toward a shared objective – we formulate the decision problem as a decentralized partially observable Markov decision process (dec-POMDP) model. Because optimally solving a dec-POMDP model is well known to be difficult, multi-agent reinforcement learning (MARL) has been used as a solution technique. However, typical MARL algorithms are not always effective at solving dec-POMDP models. Motivated by evidence from single-agent RL, we propose a MARL algorithm augmented by pretraining. Specifically, we use behavioral cloning (BC) as a means to pretrain a neural network. We verify the effectiveness of the proposed method by solving a dec-POMDP model for a decentralized selective patient admission problem. Experimental results on three disaster scenarios show that the proposed method is a viable approach to solving dec-POMDP problems and that augmenting MARL with BC pretraining appears to offer advantages over plain MARL in both solution quality and computation time.
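The core idea in the abstract – pretraining a policy network with behavioral cloning before MARL fine-tuning – can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's implementation: it uses a linear softmax policy, synthetic expert demonstrations, and plain gradient descent on the cross-entropy loss (which is all BC amounts to at the supervised-learning level). The paper's actual network architecture, MARL algorithm, and patient-admission environment are not reproduced here.

```python
import numpy as np

# Hypothetical BC-pretraining sketch (not the paper's code).
rng = np.random.default_rng(0)

# Synthetic "expert" demonstrations: 4-dim partial observations,
# with the expert choosing between two actions by a fixed linear rule.
obs = rng.normal(size=(500, 4))
true_w = np.array([1.0, -2.0, 0.5, 0.0])
actions = (obs @ true_w > 0).astype(int)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Linear policy: action logits = obs @ W (stand-in for a neural network).
W = np.zeros((4, 2))

def expert_match(W):
    """Fraction of states where the policy's greedy action matches the expert."""
    return float((softmax(obs @ W).argmax(axis=1) == actions).mean())

acc_before = expert_match(W)

# Behavioral cloning = supervised cross-entropy on expert (obs, action) pairs.
for _ in range(200):
    probs = softmax(obs @ W)
    probs[np.arange(len(actions)), actions] -= 1.0  # d(cross-entropy)/d(logits)
    W -= 0.1 * (obs.T @ probs) / len(actions)

acc_after = expert_match(W)
print(f"expert-match before BC: {acc_before:.2f}, after BC: {acc_after:.2f}")
```

After this supervised warm start, the pretrained weights would initialize the agents' policy networks, and MARL would fine-tune them against the shared objective – the step the paper's experiments suggest improves both solution quality and training time over random initialization.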
