Jmp8 at SemEval-2017 Task 2: A simple and general distributional approach to estimate word similarity

2017-08-01 · SemEval 2017

Josué Melka, Gilles Bernard


Code

github.com/yoch/jmp8
Official; in paper

Abstract

We have built a simple corpus-based system to estimate word similarity in multiple languages with a count-based approach. After training on Wikipedia corpora, our system was evaluated on the multilingual subtask of SemEval-2017 Task 2 and achieved a good level of performance, despite its great simplicity. Our results tend to demonstrate the power of the distributional approach in semantic similarity tasks, even without knowledge of the underlying language. We also show that dimensionality reduction has a considerable impact on the results.
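
As a rough illustration of what a count-based distributional pipeline with dimensionality reduction can look like, here is a minimal Python sketch. It is not the authors' implementation (see the linked repository for that). The fixed context window, the PPMI weighting, the truncated SVD, the toy corpus, and every function name are assumptions chosen for the example, since the abstract does not specify them.

```python
import numpy as np

def cooccurrence_matrix(sentences, window=2):
    """Count word/context co-occurrences inside a symmetric window (assumed setup)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    return counts, index

def ppmi(counts):
    """Positive pointwise mutual information weighting (an assumed choice)."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts give -inf or nan; treat as 0
    return np.maximum(pmi, 0.0)

def reduce_dims(matrix, k):
    """Keep the top-k singular directions (truncated SVD)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Toy corpus, purely hypothetical; the paper trains on Wikipedia corpora.
sentences = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "chases", "the", "dog"],
]
counts, index = cooccurrence_matrix(sentences)
vectors = reduce_dims(ppmi(counts), k=2)
sim = cosine(vectors[index["cat"]], vectors[index["dog"]])
print(f"similarity(cat, dog) = {sim:.3f}")
```

Nothing in this sketch depends on the language of the corpus beyond tokenization, which is in the spirit of the abstract's claim about working without knowledge of the underlying language. Varying `k` is a simple way to see how the number of retained dimensions changes the similarity scores.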

Tasks

Dimensionality Reduction, Semantic Similarity, Semantic Textual Similarity, Task 2, Word Similarity
