Historical Text Normalization with Delayed Rewards
2019-07-01 · ACL 2019
Simon Flachs, Marcel Bollmann, Anders Søgaard
Abstract
Training neural sequence-to-sequence models with a simple token-level log-likelihood objective is now a standard approach to historical text normalization, albeit one often outperformed by phrase-based models. Policy gradient training enables direct optimization for exact matches, and while the small datasets in historical text normalization prohibit reinforcement learning from scratch, we show that policy gradient fine-tuning leads to significant improvements across the board. In particular, policy gradient training yields more accurate normalizations for long or unseen words.