Determining the Multiword Expression Inventory of a Surprise Language
2016-12-01 · COLING 2016
Bahar Salehi, Paul Cook, Timothy Baldwin
Abstract
Much previous research on multiword expressions (MWEs) has focused on the token- and type-level tasks of MWE identification and extraction, respectively. Such studies typically target known prevalent MWE types in a given language. This paper describes the first attempt to learn the MWE inventory of a "surprise" language for which we have no explicit prior knowledge of MWE patterns, certainly no annotated MWE data, and not even a parallel corpus. Our proposed model is trained on a treebank with MWE relations of a source language, and can be applied to the monolingual corpus of the surprise language to identify its MWE construction types.