
Using Neural Transfer Learning for Morpho-syntactic Tagging of South-Slavic Languages Tweets

2018-08-01 · COLING 2018

Sara Meftah, Nasredine Semmar, Fatiha Sadat, Stephan Raaijmakers


Abstract

In this paper, we describe a morpho-syntactic tagger of tweets, an important component of the CEA List DeepLIMA tool, a multilingual text analysis platform based on deep learning. The tagger was built for the Morpho-syntactic Tagging of Tweets (MTT) shared task of the 2018 VarDial Evaluation Campaign. The MTT task focuses on morpho-syntactic annotation of non-canonical Twitter varieties of three South-Slavic languages: Slovene, Croatian and Serbian. We propose a neural network model trained end-to-end for the three languages, with no need for task-specific or domain-specific feature engineering. The proposed approach combines character-level and word-level representations. Given the lack of annotated social media data for South-Slavic languages, we have also implemented a cross-domain Transfer Learning (TL) approach to exploit any available related out-of-domain annotated data.
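The two ideas named in the abstract, combining character-level and word-level representations and reusing parameters learned on out-of-domain data, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the averaging of character vectors below is a simple stand-in for a learned character encoder, and every name (`make_table`, `char_representation`, `token_representation`, `transfer_parameters`), the dimensions, and the example words are hypothetical.

```python
import copy
import random


def make_table(keys, dim, seed):
    """Build a toy embedding table with small random vectors (illustrative only)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}


def char_representation(word, char_table, char_dim):
    """Average the vectors of known characters.

    Assumption: a stand-in for a learned character-level encoder,
    which the abstract does not describe in detail.
    """
    vecs = [char_table[c] for c in word.lower() if c in char_table]
    if not vecs:
        return [0.0] * char_dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def token_representation(word, word_table, char_table, word_dim, char_dim):
    """Concatenate the word-level vector with the character-level vector.

    Out-of-vocabulary words get a zero word vector, so only the
    character part carries information for them.
    """
    word_vec = word_table.get(word.lower(), [0.0] * word_dim)
    return list(word_vec) + char_representation(word, char_table, char_dim)


def transfer_parameters(source_params, target_params, shared_names):
    """Initialise selected target parameters from a source-domain model.

    Assumption: parameters are plain dicts keyed by name; the copied
    values would then be fine-tuned on the in-domain (tweet) data.
    """
    merged = dict(target_params)
    for name in shared_names:
        merged[name] = copy.deepcopy(source_params[name])
    return merged


WORD_DIM, CHAR_DIM = 4, 3
word_table = make_table(["danes", "je"], WORD_DIM, seed=0)
char_table = make_table("abcdefghijklmnopqrstuvwxyz", CHAR_DIM, seed=1)

# A non-standard spelling missing from the word vocabulary still
# receives a character-based representation.
oov_vec = token_representation("dons", word_table, char_table, WORD_DIM, CHAR_DIM)

source = {"char_table": char_table, "output_layer": [0.5, -0.5]}
target = {"char_table": {}, "output_layer": [0.0, 0.0]}
target = transfer_parameters(source, target, shared_names=["char_table"])
```

The concatenation illustrates why character-level input can help with non-canonical spellings: a word missing from the word vocabulary still yields a non-zero vector. In this sketch only `char_table` is transferred and `output_layer` keeps its target-side initialisation; which parameters to share is a design choice that the abstract does not settle.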
