BUCC2020: Bilingual Dictionary Induction using Cross-lingual Embedding

2020-05-01 · LREC 2020

Sanjanasri JP, Vijay Krishna Menon, Soman KP


Abstract

This paper presents a deep learning system for the BUCC 2020 shared task on bilingual dictionary induction from comparable corpora. We submitted two runs for this shared task: the German (de) and English (en) language pair for the "closed track", and the Tamil (ta) and English (en) language pair for the "open track". Our core approach focuses on quantifying the semantics of each language, so that the semantics of two different languages can be compared or carried over via transfer learning. Word embeddings make this quantification possible. We propose a deep learning approach that uses the supplied training data to generate cross-lingual embeddings, which are then used to induce a bilingual dictionary from comparable corpora.
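The abstract does not spell out the model, so the following is only a minimal sketch of the general pipeline it describes: map source-language embeddings into the target-language space using seed translation pairs, then induce dictionary entries by nearest-neighbour search. It deliberately swaps the authors' deep learning mapper for a plain orthogonal (Procrustes) linear map, which is a different and simpler technique. The toy vocabulary, the random vectors, and the function names `learn_orthogonal_map` and `induce_dictionary` are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def learn_orthogonal_map(src_vecs, tgt_vecs):
    """Orthogonal Procrustes: find orthogonal W minimising ||src_vecs @ W - tgt_vecs||_F."""
    u, _, vt = np.linalg.svd(src_vecs.T @ tgt_vecs)
    return u @ vt


def normalize(matrix):
    """Scale each row to unit length so dot products become cosine similarities."""
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def induce_dictionary(src_words, src_emb, tgt_words, tgt_emb, w):
    """Map source vectors with w and pick the most cosine-similar target word."""
    mapped = normalize(src_emb @ w)
    sims = mapped @ normalize(tgt_emb).T
    best = sims.argmax(axis=1)
    return {s: tgt_words[j] for s, j in zip(src_words, best)}


# Toy data (hypothetical): target vectors are random, and source vectors are
# an exact rotation of them, so a perfect orthogonal map exists.
rng = np.random.default_rng(0)
src_words = ["hund", "katze", "haus", "wasser", "buch", "baum"]
tgt_words = ["dog", "cat", "house", "water", "book", "tree"]
tgt_emb = rng.normal(size=(6, 4))
rotation = np.linalg.qr(rng.normal(size=(4, 4)))[0]
src_emb = tgt_emb @ rotation.T

# Use the first five pairs as the seed (training) dictionary.
w = learn_orthogonal_map(src_emb[:5], tgt_emb[:5])
induced = induce_dictionary(src_words, src_emb, tgt_words, tgt_emb, w)
print(induced)
```

On real embeddings the two spaces are not exact rotations of each other, which is one reason a non-linear learned mapper such as the one the authors propose may be preferred; plain nearest-neighbour retrieval is also often replaced by a hubness-corrected similarity measure.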
