Evaluating Informal-Domain Word Representations With UrbanDictionary

2016-06-27, WS 2016

Naomi Saphra, Adam Lopez


Code

github.com/nsaphra/urbandic-scraper

Abstract

Existing corpora for intrinsic evaluation are not targeted towards tasks in informal domains such as Twitter or news comment forums. We want to test whether a representation of informal words fulfills the promise of eliding explicit text normalization as a preprocessing step. One possible evaluation metric for such domains is the proximity of spelling variants. We propose how such a metric might be computed and how a spelling variant dataset can be collected using UrbanDictionary.
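As a rough illustration of what a proximity-of-spelling-variants score could look like, the sketch below ranks each variant among the nearest neighbors of its standard form by cosine similarity and averages the reciprocal ranks. This is an assumption made for illustration, not the metric defined in the paper: the function names, the choice of mean reciprocal rank, and the toy vectors are all hypothetical, and in practice the pairs would come from a spelling variant dataset such as one collected from UrbanDictionary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def variant_rank(embeddings, word, variant):
    """1-based rank of `variant` among all other words, ordered by similarity to `word`."""
    target = cosine(embeddings[word], embeddings[variant])
    closer = sum(
        1
        for other, vec in embeddings.items()
        if other not in (word, variant) and cosine(embeddings[word], vec) > target
    )
    return closer + 1


def mean_reciprocal_rank(embeddings, pairs):
    """Average 1/rank over (word, variant) pairs; pairs with a missing word are skipped."""
    scored = [(w, v) for w, v in pairs if w in embeddings and v in embeddings]
    if not scored:
        return 0.0
    return sum(1.0 / variant_rank(embeddings, w, v) for w, v in scored) / len(scored)


# Hypothetical toy vectors, made up for this example only.
toy_embeddings = {
    "tomorrow": [1.0, 0.0],
    "tmrw": [0.9, 0.1],
    "banana": [0.0, 1.0],
    "cat": [0.5, 0.5],
}

# Hypothetical variant pairs: (standard form, informal spelling).
toy_pairs = [("tomorrow", "tmrw"), ("tomorrow", "2morrow")]

score = mean_reciprocal_rank(toy_embeddings, toy_pairs)
```

A higher score means spelling variants sit closer to their standard forms in the embedding space. Out-of-vocabulary pairs are skipped here for simplicity; a real evaluation would need to decide whether to penalize them instead.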

Tasks

Text Normalization
