Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach

2024-02-01

Zhiyuan Yao, Ionut Florescu, Chihoon Lee

Abstract

We introduce a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, whereas previous methods used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We first apply the methodology to simple tasks to understand its features, and then compare the performance of the methods in controlling multiple Atari games.
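
The contrast between deterministic and stochastic planning under delay can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a known tabular transition model `P`, a given action-value table `Q`, and a mean-minus-standard-deviation risk criterion controlled by `risk_aversion`; all of these names and choices are hypothetical and introduced here only for illustration. The agent sees a delayed state `s_obs` and the list of actions it has already issued but whose effects it has not yet observed (`pending_actions`).

```python
import numpy as np


def deterministic_plan(P, Q, s_obs, pending_actions):
    """Predict one current state via the most likely transitions, then act greedily."""
    s = s_obs
    for a in pending_actions:
        s = int(np.argmax(P[a, s]))  # most likely next state under action a
    return int(np.argmax(Q[s]))


def sample_current_states(P, s_obs, pending_actions, n_samples, rng):
    """Sample the unobserved current state by rolling the model forward."""
    n_states = P.shape[-1]
    states = np.full(n_samples, s_obs)
    for a in pending_actions:
        # Draw one next state per sample from the model's transition distribution.
        states = np.array([rng.choice(n_states, p=P[a, s]) for s in states])
    return states


def stochastic_plan(P, Q, s_obs, pending_actions,
                    risk_aversion=0.0, n_samples=500, seed=0):
    """Score each action over sampled current states (assumed mean - k * std rule)."""
    rng = np.random.default_rng(seed)
    states = sample_current_states(P, s_obs, pending_actions, n_samples, rng)
    q = Q[states]  # shape: (n_samples, n_actions)
    score = q.mean(axis=0) - risk_aversion * q.std(axis=0)
    return int(np.argmax(score))
```

In this toy setting, when every row of `P` is one-hot (deterministic transitions), all sampled states coincide, the standard deviation term vanishes, and `stochastic_plan` returns the same action as `deterministic_plan`. When transitions are stochastic, the two can differ, because the deterministic planner ignores less likely states that may carry large negative value.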
