What's in Your Embedding, And How It Predicts Task Performance

2018-08-01 · COLING 2018

Anna Rogers, Shashwath Hosur Ananthakrishna, Anna Rumshisky

Abstract

Attempts to find a single technique for general-purpose intrinsic evaluation of word embeddings have so far not been successful. We present a new approach based on scaled-up qualitative analysis of word vector neighborhoods that quantifies interpretable characteristics of a given model (e.g., its preference for synonyms or shared morphological forms as nearest neighbors). We analyze 21 such factors and show how they correlate with performance on 14 extrinsic and intrinsic task datasets (and also explain the lack of correlation between some of them). Our approach enables multi-faceted evaluation, parameter search, and, more generally, a more principled, hypothesis-driven approach to the development of distributional semantic representations.
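To make the idea of a neighborhood factor concrete, here is a minimal sketch of one such measurement: the share of nearest neighbors that are synonyms of the query word. It is an illustration under stated assumptions, not the authors' implementation. The function names, the toy vectors, and the toy synonym dictionary are all hypothetical, and the choice of Pearson correlation is an assumption, since the abstract does not say which correlation measure is used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(word, vectors, k):
    """Return the k words most similar to `word` by cosine similarity."""
    target = vectors[word]
    scored = [(cosine(target, vec), w) for w, vec in vectors.items() if w != word]
    scored.sort(reverse=True)
    return [w for _, w in scored[:k]]

def synonym_share(vectors, synonyms, k=1):
    """Fraction of top-k neighbors that are listed synonyms of the query word.

    `synonyms` maps each query word to a set of its synonyms (hypothetical
    stand-in for a lexical resource).
    """
    hits = total = 0
    for word, syns in synonyms.items():
        for neighbor in nearest_neighbors(word, vectors, k):
            hits += neighbor in syns
            total += 1
    return hits / total if total else 0.0

def pearson(xs, ys):
    """Pearson correlation between per-model factor values and task scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y) if std_x and std_y else 0.0

# Toy, made-up 2-d vectors and synonym pairs, for illustration only.
toy_vectors = {
    "big": [1.0, 0.1],
    "large": [0.9, 0.2],
    "small": [0.1, 1.0],
    "tiny": [0.2, 0.9],
}
toy_synonyms = {"big": {"large"}, "small": {"tiny"}}

share = synonym_share(toy_vectors, toy_synonyms, k=1)
```

In a setup like this, one would compute the factor once per embedding model and pass the resulting list, together with the same models' scores on a task dataset, to `pearson` to see whether the factor tracks task performance.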

Tasks

Named Entity Recognition (NER), Semantic Role Labeling, Word Embeddings
