
Imitation Learning for Sentence Generation with Dilated Convolutions Using Adversarial Training

2019-08-15 · 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW) · Code Available

Jian-Wei Peng, Min-Chun Hu, Chuan-Wang Chang


Abstract

In this work, we formulate sentence generation as an imitation learning problem, in which the goal is to learn a policy that mimics an expert. Recent works have shown that adversarial learning can be applied to imitation learning. However, in reinforcement learning (RL)-based generative adversarial networks (GANs), the reward signal from the discriminator is not robust, and estimating state-action values is usually computationally intractable. To address these problems, we propose to use two discriminators that provide two different reward signals, yielding a more general imitation learning framework for sequence generation. As a result, Monte Carlo (MC) rollout is not needed, which keeps our algorithm computationally tractable when generating long sequences. Furthermore, our policy and discriminator networks are integrated by sharing a state encoder network built on dilated convolutions instead of recurrent neural networks (RNNs). Our experiments show that the two reward signals control the trade-off between the quality and the diversity of the output sequences.
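As a rough illustration of the idea in the abstract (not the authors' actual method or code), the sketch below blends two per-token discriminator reward signals with a weight that trades off quality against diversity, and computes discounted returns in a single backward pass over a completed sequence, so no Monte Carlo rollout from intermediate states is required. All names (`combined_reward`, `reinforce_returns`, `lam`) are illustrative assumptions.

```python
# Hypothetical sketch of the two-reward-signal idea; names and the linear
# blending scheme are assumptions, not taken from the paper.

def combined_reward(r_quality, r_diversity, lam=0.5):
    """Blend two per-token reward signals into one reward per step.

    `lam` controls the quality/diversity trade-off mentioned in the abstract.
    """
    assert len(r_quality) == len(r_diversity)
    return [lam * q + (1.0 - lam) * d
            for q, d in zip(r_quality, r_diversity)]

def reinforce_returns(rewards, gamma=0.99):
    """Discounted return at each time step, computed in one backward pass
    over a fully generated sequence (no per-state Monte Carlo rollouts)."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# Example: equal weighting of the two signals, undiscounted returns.
blended = combined_reward([1.0, 0.0], [0.0, 1.0], lam=0.5)   # [0.5, 0.5]
returns = reinforce_returns([0.0, 0.0, 1.0], gamma=1.0)       # [1.0, 1.0, 1.0]
```

In a full policy-gradient setup, each return would scale the log-probability of the corresponding generated token; that step is omitted here to keep the sketch framework-free.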
