AGILe: The First Lemmatizer for Ancient Greek Inscriptions

2022-06-01 · LREC 2022

Evelien de Graaf, Silvia Stopponi, Jasper K. Bos, Saskia Peels-Matthey, Malvina Nissim

Abstract

To facilitate corpus searches by classicists and to reduce data sparsity when training models, we focus on the automatic lemmatization of ancient Greek inscriptions, which have received less attention in this respect than literary texts. We show that existing lemmatizers for ancient Greek, trained on literary data, perform poorly on epigraphic data because of major linguistic differences between the two types of text. We therefore train the first inscription-specific lemmatizer, which achieves above 80% accuracy, and make both the models and the lemmatized data available to the community. We also provide a detailed error analysis of the peculiarities of inscriptions, which further underlines the importance of a lemmatizer dedicated to inscriptions.
