That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory
Anonymous
Abstract
Language evolution follows the rule of gradual change. Shifts in grammar, vocabulary, and lexical semantics take place over time, resulting in a diachronic linguistic gap. As a result, a considerable number of texts are written in the languages of different eras, which poses obstacles to natural language processing tasks such as word segmentation and machine translation. Chinese is a language with a long history, yet previous work on Chinese natural language processing has mainly focused on tasks within a specific era. In this paper, we therefore propose CROSSWISE, a cross-era learning framework for Chinese word segmentation (CWS) that uses a Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora from different eras show that performance on each corpus improves significantly. Further analyses also demonstrate that the SM module can effectively integrate knowledge of the eras into the neural network.