Multi-task learning for historical text normalization: Size matters
Marcel Bollmann, Anders Søgaard, Joachim Bingel
Abstract
Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. However, that work has been limited to datasets from a specific language and a specific historical period, and it is not clear whether its results generalize. It therefore remains an open question when historical text normalization benefits from multi-task learning. We explore the benefits of multi-task learning across 10 different datasets, representing different languages and periods. Our main finding, contrary to what has been observed for other NLP tasks, is that multi-task learning mainly helps when target-task data is very scarce.