Determining the Multiword Expression Inventory of a Surprise Language

2016-12-01 · COLING 2016

Bahar Salehi, Paul Cook, Timothy Baldwin

Abstract

Much previous research on multiword expressions (MWEs) has focused on the token- and type-level tasks of MWE identification and extraction, respectively. Such studies typically target known prevalent MWE types in a given language. This paper describes the first attempt to learn the MWE inventory of a "surprise" language for which we have no explicit prior knowledge of MWE patterns, certainly no annotated MWE data, and not even a parallel corpus. Our proposed model is trained on a treebank with MWE relations of a source language, and can be applied to the monolingual corpus of the surprise language to identify its MWE construction types.
