Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture

2022-01-16 · ACL ARR January 2022

Anonymous

Abstract

The large pre-trained language model GPT-2 has been fine-tuned for task-oriented dialog systems and has achieved state-of-the-art performance on many datasets. However, there has been little work on reinforcement learning (RL) for these GPT-2 based dialog systems, let alone on designing a GPT-2 based user simulator. In this paper, we propose a dialog system and a user simulator, both based on GPT-2 with a simplified generative architecture, for reinforcement learning. Experiments are conducted on MultiWOZ2.1, and we evaluate our system with both an offline method and an online method. The results show that our dialog system achieves the best performance among all GPT-2 based models even without RL optimization, and that its performance improves further after RL. We also explore different reward settings in RL and provide an in-depth analysis of how the model attends to different information and how RL improves the performance of the dialog system.
