Distant Reading in Digital Humanities: Case Study on the Serbian Part of the ELTeC Collection

2022-06-01 · LREC 2022

Ranka Stanković, Cvetana Krstev, Branislava Šandrih Todorović, Dusko Vitas, Mihailo Skoric, Milica Ikonić Nešić

Abstract

In this paper we present the Serbian part of the ELTeC multilingual corpus of novels written in the period 1840-1920. The corpus is being built to test various distant reading methods and tools, with the aim of re-thinking European literary history. We present the steps that led to the production of the Serbian sub-collection: novel selection and retrieval, text preparation, structural annotation, POS-tagging, lemmatization, and named entity recognition. The Serbian sub-collection was published on several platforms to make it freely available to various users. Several usage examples show that this sub-collection is useful for both close and distant reading approaches.
