Learning Natural Language Generation with Truncated Reinforcement Learning

2022-07-01 · NAACL 2022

Alice Martin, Guillaume Quispe, Charles Ollion, Sylvain Le Corff, Florian Strub, Olivier Pietquin

Code

github.com/amdonati/rl-nlp
Official · In paper · PyTorch · ★ 6

Abstract

This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach for training conditional language models without a supervised learning phase, using only reinforcement learning (RL). Because RL methods scale poorly to large action spaces, we dynamically truncate the vocabulary space using a generic language model. TrufLL thus makes it possible to train a language agent solely by interacting with its environment, without any task-specific prior knowledge; it is guided only by a task-agnostic language model. Interestingly, this approach removes the dependency on labelled datasets and inherently reduces pretrained policy flaws such as language or exposure biases. We evaluate TrufLL on two visual question generation tasks, for which we report positive results on performance and language metrics, which we then corroborate with a human evaluation. To our knowledge, it is the first approach that successfully learns a language generation policy without pre-training, using only reinforcement learning.
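
The truncation idea in the abstract can be illustrated with a small sketch. It is not taken from the official repository: the top-k rule, the function names (`truncate_vocabulary`, `truncated_policy`, `sample_action`) and the toy numbers are all assumptions chosen for illustration, since the abstract only states that a generic language model is used to restrict the vocabulary at each step. The sketch shows a policy that may only choose among the tokens a language model ranks highest.

```python
import math
import random

def truncate_vocabulary(lm_log_probs, k):
    """Return the indices of the k tokens the language model scores highest.

    Top-k is an assumed truncation rule; the abstract does not name one.
    """
    ranked = sorted(range(len(lm_log_probs)),
                    key=lambda i: lm_log_probs[i], reverse=True)
    return sorted(ranked[:k])

def truncated_policy(policy_logits, allowed):
    """Softmax over the allowed tokens only; all other tokens get probability 0."""
    max_logit = max(policy_logits[i] for i in allowed)  # for numerical stability
    exps = {i: math.exp(policy_logits[i] - max_logit) for i in allowed}
    total = sum(exps.values())
    probs = [0.0] * len(policy_logits)
    for i, e in exps.items():
        probs[i] = e / total
    return probs

def sample_action(probs, rng):
    """Sample one token index from the truncated distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy 5-token vocabulary (hypothetical numbers).
lm_log_probs = [math.log(p) for p in [0.05, 0.40, 0.10, 0.30, 0.15]]
policy_logits = [2.0, 0.5, -1.0, 0.5, 3.0]

allowed = truncate_vocabulary(lm_log_probs, k=2)   # tokens 1 and 3
probs = truncated_policy(policy_logits, allowed)   # mass only on tokens 1 and 3
action = sample_action(probs, random.Random(0))
```

In this toy case the policy's own preferred tokens (indices 0 and 4) are excluded because the language model ranks them low, so the agent explores only within the reduced action space. In an RL loop, the sampled action would be scored by the environment's reward and the policy logits updated accordingly.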

Tasks

Language Modelling · Question Generation · Reinforcement Learning (RL) · Text Generation
