The HIT-SCIR System for End-to-End Parsing of Universal Dependencies
Wanxiang Che, Jiang Guo, Yuxuan Wang, Bo Zheng, Huaipeng Zhao, Yang Liu, Dechuan Teng, Ting Liu
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
This paper describes our system (HIT-SCIR) for the CoNLL 2017 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system includes three pipelined components: tokenization, Part-of-Speech (POS) tagging and dependency parsing. We use character-based bidirectional long short-term memory (LSTM) networks for both tokenization and POS tagging. Afterwards, we employ a list-based transition-based algorithm for general non-projective parsing and present an improved Stack-LSTM-based architecture for representing each transition state and making predictions. Furthermore, to parse low/zero-resource languages and cross-domain data, we use a model transfer approach to make effective use of existing resources. We demonstrate substantial gains against the UDPipe baseline, with an average improvement of 3.76\% in LAS of all languages. And finally, we rank the 4th place on the official test sets.