Revisiting Tri-training of Dependency Parsers

2021-09-16 · EMNLP 2021

Joachim Wagner, Jennifer Foster

Code

github.com/jowagner/ud-combination
Official · In paper · none · ★ 2
github.com/jowagner/mtb-tri-training
Official · In paper · tf · ★ 2

Abstract

We compare two orthogonal semi-supervised learning techniques, namely tri-training and pretrained word embeddings, in the task of dependency parsing. We explore language-specific FastText and ELMo embeddings and multilingual BERT embeddings. We focus on a low-resource scenario, as semi-supervised learning can be expected to have the most impact here. Based on treebank size and available ELMo models, we select Hungarian, Uyghur (a zero-shot language for mBERT) and Vietnamese. Furthermore, we include English in a simulated low-resource setting. We find that pretrained word embeddings make more effective use of unlabelled data than tri-training but that the two approaches can be successfully combined.
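
For readers unfamiliar with tri-training, the general idea can be sketched on a toy problem: three learners are trained on bootstrap samples of the labelled data, and each learner is then retrained on its own sample plus the unlabelled items on which the other two learners agree. This is a minimal illustration of the generic technique only, not the authors' implementation: the 1-D nearest-centroid classifier stands in for a dependency parser, the names `CentroidClassifier`, `agreed_pseudo_labels`, `tri_train` and `vote` are hypothetical, and refinements such as error-rate checks or sampling of the pseudo-labelled data are omitted.

```python
import random


class CentroidClassifier:
    """Toy stand-in for a parser: nearest-centroid classifier on 1-D points."""

    def fit(self, examples):
        # examples is a list of (x, label) pairs
        sums, counts = {}, {}
        for x, y in examples:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}
        return self

    def predict(self, x):
        # Return the label whose centroid is closest to x
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


def agreed_pseudo_labels(teacher_a, teacher_b, unlabelled):
    """Keep unlabelled items on which both teachers predict the same label."""
    agreed = []
    for x in unlabelled:
        label = teacher_a.predict(x)
        if label == teacher_b.predict(x):
            agreed.append((x, label))
    return agreed


def tri_train(labelled, unlabelled, rounds=2, seed=0):
    """Simplified tri-training loop (assumption: fixed number of rounds)."""
    rng = random.Random(seed)
    # One bootstrap sample of the labelled data per learner, for diversity
    seeds = [[rng.choice(labelled) for _ in labelled] for _ in range(3)]
    learners = [CentroidClassifier().fit(s) for s in seeds]
    for _ in range(rounds):
        new_learners = []
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            # Learner i is taught by the agreement of the other two
            extra = agreed_pseudo_labels(learners[j], learners[k], unlabelled)
            new_learners.append(CentroidClassifier().fit(seeds[i] + extra))
        learners = new_learners
    return learners


def vote(learners, x):
    """Combine the three learners by majority vote (ties broken arbitrarily)."""
    preds = [m.predict(x) for m in learners]
    return max(set(preds), key=preds.count)
```

In this sketch, a call such as `tri_train(labelled, unlabelled)` returns the three retrained learners, and `vote(learners, x)` combines their predictions. A real parser setting would replace the label-equality test with a criterion for agreement between predicted trees, which this toy example does not attempt to model.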

Tasks

Dependency Parsing, Word Embeddings
