Multilingual Universal Dependency Parsing from Raw Text with Low-Resource Language Enhancement

2018-10-01 · CoNLL 2018

Yingting Wu, Hai Zhao, Jia-Jun Tong

Abstract

This paper describes the system of our team, Phoenix, for the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Given the annotated gold-standard data in CoNLL-U format, we train the tokenizer, tagger and parser separately for each treebank using the open-source pipeline tool UDPipe. Our system takes plain text as input, performs the pre-processing steps (tokenization, lemmatization, morphological analysis) and finally outputs the syntactic dependencies. For low-resource languages with no training data, we use cross-lingual techniques to build models from closely related languages instead. In the official evaluation, our system achieves macro-averaged scores of 65.61%, 52.26% and 55.71% for LAS, MLAS and BLEX, respectively.
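To make the reported metric concrete, here is a minimal sketch of how a labeled attachment score (LAS) can be computed from CoNLL-U output and macro-averaged over treebanks. It is not the authors' code or the official evaluation script. It assumes gold and system files share the same tokenization (the shared task evaluates from raw text, so the official scorer also has to align differing tokenizations, which is omitted here), and the helper names `token`, `read_arcs`, `las` and `macro_average`, the toy sentence, and the treebank names are all hypothetical.

```python
from typing import Dict, List, Tuple

# Column positions in a CoNLL-U word line (10 tab-separated fields).
ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC = range(10)

Arc = Tuple[int, str]  # (head index, dependency relation)


def token(idx: int, form: str, head: int, deprel: str) -> str:
    """Build one CoNLL-U word line; unused columns are left as '_'."""
    return "\t".join(
        [str(idx), form, "_", "_", "_", "_", str(head), deprel, "_", "_"]
    )


def read_arcs(conllu: str) -> List[Arc]:
    """Collect (HEAD, DEPREL) for every syntactic word in a CoNLL-U string."""
    arcs: List[Arc] = []
    for line in conllu.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank sentence separator or comment line
        cols = line.split("\t")
        if not cols[ID].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        arcs.append((int(cols[HEAD]), cols[DEPREL]))
    return arcs


def las(gold: List[Arc], system: List[Arc]) -> float:
    """Percentage of words whose head and relation both match the gold arcs.

    Assumption: both lists cover the same words in the same order.
    """
    if not gold or len(gold) != len(system):
        raise ValueError("gold and system must be non-empty and equally long")
    correct = sum(g == s for g, s in zip(gold, system))
    return 100.0 * correct / len(gold)


def macro_average(scores: Dict[str, float]) -> float:
    """Unweighted mean over treebanks, so each treebank counts equally."""
    return sum(scores.values()) / len(scores)


# Toy data (hypothetical): the system attaches "loudly" to the wrong head.
GOLD = "\n".join([
    "# text = Dogs bark loudly",
    token(1, "Dogs", 2, "nsubj"),
    token(2, "bark", 0, "root"),
    token(3, "loudly", 2, "advmod"),
])
SYSTEM = "\n".join([
    "# text = Dogs bark loudly",
    token(1, "Dogs", 2, "nsubj"),
    token(2, "bark", 0, "root"),
    token(3, "loudly", 1, "advmod"),
])

if __name__ == "__main__":
    score = las(read_arcs(GOLD), read_arcs(SYSTEM))
    print(f"LAS on toy sentence: {score:.2f}")  # 2 of 3 words correct
    # Hypothetical per-treebank scores, only to show the averaging step.
    print(macro_average({"treebank_a": 50.0, "treebank_b": 100.0}))
```

Macro-averaging gives a small low-resource treebank the same weight as a large one, which is why performance on languages with no training data has a visible effect on the overall figure.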
