Word Clustering for Historical Newspapers Analysis

2019-09-01 · RANLP 2019

Lidia Pivovarova, Elaine Zosa, Jani Marjanen

Abstract

This paper is part of a collaboration between computer scientists and historians aimed at developing novel tools and methods to improve the analysis of historical newspapers. We present a case study of ideological terms ending with the suffix -ism in nineteenth-century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices, and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools.
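The second step of the procedure can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that embeddings for each time slice have already been trained (step one is not shown), and it uses small synthetic vectors in place of real newspaper embeddings. The abstract does not name a clustering algorithm, so a plain k-means with deterministic initialisation stands in here. All function names (`nearest_neighbours`, `kmeans`, `cluster_slice`, `toy_slice`) and the example vocabulary are hypothetical.

```python
import numpy as np

def nearest_neighbours(vectors, word, k):
    """Return the k words closest to `word` by cosine similarity."""
    words = [w for w in vectors if w != word]
    mat = np.stack([vectors[w] for w in words])
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    query = vectors[word] / np.linalg.norm(vectors[word])
    order = np.argsort(-(mat @ query))[:k]
    return [words[i] for i in order]

def kmeans(points, n_clusters, n_iter=20):
    """Plain k-means (a stand-in clustering method); one label per row."""
    # Farthest-point initialisation keeps the sketch deterministic.
    centres = [points[0]]
    while len(centres) < n_clusters:
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centres], axis=0)
        centres.append(points[dist.argmax()])
    centres = np.stack(centres)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centres[c] = points[labels == c].mean(axis=0)
    return labels

def cluster_slice(vectors, targets, k=2, n_clusters=2):
    """Cluster the target words together with their k nearest neighbours."""
    vocab = []
    for target in targets:
        for w in [target] + nearest_neighbours(vectors, target, k):
            if w not in vocab:
                vocab.append(w)
    points = np.stack([vectors[w] for w in vocab]).astype(float)
    return dict(zip(vocab, kmeans(points, n_clusters).tolist()))

def toy_slice(seed):
    """Synthetic 3-d vectors standing in for one time slice's embeddings."""
    rng = np.random.default_rng(seed)
    politics = np.array([1.0, 0.0, 0.0])
    religion = np.array([0.0, 1.0, 0.0])
    groups = {"socialism": politics, "party": politics, "worker": politics,
              "pietism": religion, "church": religion, "sermon": religion}
    return {w: base + 0.05 * rng.standard_normal(3) for w, base in groups.items()}

# One embedding table per time slice; the slice labels are illustrative only.
slices = {"1860s": toy_slice(1), "1890s": toy_slice(2)}
clusters = {period: cluster_slice(vecs, ["socialism", "pietism"])
            for period, vecs in slices.items()}

for period, assignment in clusters.items():
    print(period, assignment)
```

Clustering each slice separately, as done here, lets the neighbourhoods of the same -ism word be compared across periods; with real data the neighbour count and number of clusters would need tuning.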
