From FreEM to D’AlemBERT: a Large Corpus and a Language Model for Early Modern French

2022-06-01 · LREC 2022

Simon Gabay, Pedro Ortiz Suarez, Alexandre Bartz, Alix Chagué, Rachel Bawden, Philippe Gambette, Benoît Sagot

Abstract

Language models for historical states of language are becoming increasingly important to allow the optimal digitisation and analysis of old textual sources. Because these historical states are at the same time more complex to process and more scarce in the corpora available, this paper presents recent efforts to overcome this difficult situation. These efforts include producing a corpus, creating the model, and evaluating it with an NLP task currently used by scholars in other ongoing projects.

Tasks

Language Modelling
