
Topic Aware Neural Language Model: Domain Adaptation of Unconditional Text Generation Models

2021-09-29

Noriaki Kawamae


Abstract

Our goal is to adapt pre-trained neural language models (NLMs) to unconditional text generation in a target domain. Because many Transformer-based NLMs are trained on corpora far larger and more heterogeneous than the target domain, the gap between these corpora and the target domain raises the question of whether such NLMs still deliver their benefits on this task even after fine-tuning. To address this problem, our approach focuses on topics to bridge the semantic gap between the pre-training corpora and the target-domain corpus, relating them at the topic level. Specifically, it injects topics into these NLMs and trains them on the topical dependencies across segments, introducing topic alignment (TA) and two training tasks (TDM and TEM), whereas previous Transformer-based NLMs learn mainly from fixed-length contexts such as predefined segments. Experiments show that this approach helps resolve the imbalance between these corpora and can tailor previously pre-trained NLMs to generate coherent, semantically valid text that reflects a given small fine-tuning corpus.
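The abstract describes injecting topics into a pre-trained NLM so that generation is conditioned on a target-domain topic. The paper's actual architecture (TA, TDM, TEM) is not spelled out here, so the following is only a minimal illustrative sketch, under the assumption that topic injection can be realized by adding a learned topic embedding to each token embedding, analogous to how positional embeddings are added in a Transformer. All names (`embed`, `topic_emb`, dimensions) are hypothetical.

```python
import numpy as np

# Hypothetical sketch (NOT the paper's implementation): condition a
# pre-trained LM's input on a document-level topic by adding a learned
# topic embedding to every token embedding, the same way positional
# embeddings are added.

rng = np.random.default_rng(0)
vocab_size, n_topics, d_model = 100, 5, 16

token_emb = rng.normal(size=(vocab_size, d_model))  # pre-trained, kept frozen
topic_emb = rng.normal(size=(n_topics, d_model))    # new, learned during fine-tuning

def embed(token_ids, topic_id):
    """Return token embeddings conditioned on a document-level topic."""
    return token_emb[token_ids] + topic_emb[topic_id]

x = embed(np.array([3, 7, 42]), topic_id=2)
print(x.shape)  # (3, 16): one topic-conditioned vector per token
```

In this sketch, only the small topic-embedding table would be trained on the target domain, which matches the abstract's motivation of adapting a large pre-trained model with a small fine-tuning corpus.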
