
To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies

2019-09-01 · WS 2019

Dirk Väth, Ngoc Thang Vu


Abstract

In this paper, we explore state-of-the-art deep reinforcement learning methods for dialog policy training such as prioritized experience replay, double deep Q-Networks, dueling network architectures and distributional learning. Our main findings show that each individual method improves the rewards and the task success rate but combining these methods in a Rainbow agent, which performs best across tasks and environments, is a non-trivial task. We, therefore, provide insights about the influence of each method on the combination and how to combine them to form a Rainbow agent.
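For readers unfamiliar with the components named in the abstract, the following is a minimal, illustrative sketch of three of them in plain Python: the dueling aggregation of state value and advantages, the double deep Q-Network target, and the proportional sampling rule of prioritized experience replay. It is not the authors' implementation. All function names, parameter names, and default values (for example `alpha=0.6`) are assumptions chosen for illustration, and distributional learning is omitted for brevity.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) is proportional to
    (|td_error_i| + eps) ** alpha. alpha and eps are assumed values."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

In a combined agent these pieces interact: the dueling head produces the Q-values that feed the double DQN target, and the resulting TD errors set the replay priorities, which is one reason the combination needs care.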
