Semi-supervised Sequence Learning

2015-11-04 · NeurIPS 2015 · Code Available

Andrew M. Dai, Quoc V. Le


Code

github.com/autobotasia/vibert
tf★ 9
github.com/kelly2016/multi-label-bert
tf★ 7
github.com/MichaelZhouwang/LMlexsub
tf★ 3
github.com/luckynozomi/PPI_Bert
tf★ 1
github.com/LoveYang/bert_test
tf★ 1
github.com/coco60/bert-test
tf★ 1
github.com/Satan012/BERT
tf★ 0
github.com/Nstats/bert_senti_analysis_ch
tf★ 0
github.com/algharak/BERTenhance
tf★ 0
github.com/vanpersie32/Multigpu-Bert
tf★ 0

Abstract

We present two approaches that use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and predicts the input sequence again. These two algorithms can be used as a "pretraining" step for a later supervised sequence learning algorithm. In other words, the parameters obtained from the unsupervised step can be used as a starting point for other supervised training models. In our experiments, we find that long short term memory recurrent networks after being pretrained with the two approaches are more stable and generalize better. With pretraining, we are able to train long short term memory recurrent networks up to a few hundred timesteps, thereby achieving strong performance in many text classification tasks, such as IMDB, DBpedia and 20 Newsgroups.
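The two unsupervised objectives and the parameter hand-off described in the abstract can be illustrated with a small sketch. It only shows how training pairs would be built from an unlabeled token sequence and how pretrained weights might seed a classifier; it is not the authors' implementation and contains no LSTM. The `EOS` marker, the function names, and the parameter-dictionary layout are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker (not named in the abstract)


def language_model_pairs(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Approach 1: for every prefix of the sequence, the target is the next token."""
    seq = tokens + [EOS]
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


def autoencoder_pair(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Approach 2: the encoder reads the whole sequence; the decoder target
    is the same sequence again (here terminated with EOS)."""
    return list(tokens), tokens + [EOS]


def init_from_pretrained(
    pretrained: Dict[str, List[float]],
    classifier: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Use pretrained parameters as the starting point for supervised training.

    Parameters whose names appear in both dictionaries are copied from the
    pretrained model; classifier-only parameters (e.g. an output layer) keep
    their own initial values.
    """
    merged = dict(classifier)
    for name, value in pretrained.items():
        if name in merged:
            merged[name] = list(value)
    return merged


if __name__ == "__main__":
    review = ["this", "film", "is", "great"]
    for context, target in language_model_pairs(review):
        print(context, "->", target)
    print(autoencoder_pair(review))

    # Hypothetical parameter names, for illustration only.
    pretrained = {"lstm.weights": [0.3, -0.1], "decoder.weights": [0.5]}
    classifier = {"lstm.weights": [0.0, 0.0], "output.weights": [0.0]}
    print(init_from_pretrained(pretrained, classifier))
```

In this sketch neither objective needs labels: the targets come from the sequence itself, which is why unlabeled text can be used for the pretraining step.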

Tasks

Language Modeling, Text Classification
