Evaluating hypotheses in geolocation on a very large sample of Twitter
2017-09-01 · WS 2017
Bahar Salehi, Anders Søgaard
Abstract
Recent work in geolocation has made several hypotheses about what linguistic markers are relevant to detect where people write from. In this paper, we examine six hypotheses against a corpus consisting of all geo-tagged tweets from the US, or whose geo-tags could be inferred, in a 19% sample of Twitter history. Our experiments lend support to all six hypotheses, including that spelling variants and hashtags are strong predictors of location. We also study what kinds of common nouns are predictive of location after controlling for named entities such as dolphins or sharks.