A Text Editing Approach to Joint Japanese Word Segmentation, POS Tagging, and Lexical Normalization

2021-11-01 · WNUT (ACL) 2021

Shohei Higashiyama, Masao Utiyama, Taro Watanabe, Eiichiro Sumita

Abstract

Lexical normalization, in addition to word segmentation and part-of-speech tagging, is a fundamental task for Japanese user-generated text processing. In this paper, we propose a text editing model that solves the three tasks jointly, together with methods of pseudo-labeled data generation to overcome the problem of data deficiency. Our experiments showed that the proposed model achieved better normalization performance when trained on more diverse pseudo-labeled data.
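
As a rough illustration of what a text editing formulation of this kind might look like, the sketch below decodes per-character tags into segmented words and their normalized forms. The tag inventory (`B`/`I` for segmentation; `KEEP`, `DEL`, `REP:<string>` for edits) and the function name `decode` are assumptions made for this example and are not taken from the paper; POS tags are omitted for brevity.

```python
from typing import List, Tuple


def decode(chars: str, seg_tags: List[str], edit_tags: List[str]) -> List[Tuple[str, str]]:
    """Turn per-character tags into (surface word, normalized word) pairs.

    Hypothetical tag scheme (an assumption, not the paper's):
      seg_tags:  "B" starts a new word, "I" continues the current word
      edit_tags: "KEEP" copies the character, "DEL" drops it,
                 "REP:<string>" replaces it with <string>
    """
    if not (len(chars) == len(seg_tags) == len(edit_tags)):
        raise ValueError("need one segmentation tag and one edit tag per character")

    words: List[Tuple[str, str]] = []
    surface, normalized = "", ""
    for ch, seg, edit in zip(chars, seg_tags, edit_tags):
        # A "B" tag closes the word collected so far, if any.
        if seg == "B" and surface:
            words.append((surface, normalized))
            surface, normalized = "", ""
        surface += ch
        if edit == "KEEP":
            normalized += ch
        elif edit.startswith("REP:"):
            normalized += edit[len("REP:"):]
        elif edit != "DEL":
            raise ValueError(f"unknown edit tag: {edit}")
    if surface:
        words.append((surface, normalized))
    return words


# Informal spelling "すごーいね": deleting the long-vowel mark gives "すごい".
result = decode(
    "すごーいね",
    ["B", "I", "I", "I", "B"],
    ["KEEP", "KEEP", "DEL", "KEEP", "KEEP"],
)
print(result)
```

In this toy scheme a single pass over the characters yields both the word boundaries and the normalized forms, which is one way to read "solving the tasks jointly"; the model described in the abstract may define its tags and decoding differently.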
