Multi-Candidate Word Segmentation using Bi-directional LSTM Neural Networks

2018-05-07 · Code available

Theerapat Lapjaturapit, Kobkrit Viriyayudhakom, Thanaruk Theeramunkong

Abstract

Most existing word segmentation methods output a single segmentation solution. This paper analyzes word segmentation performance when more than one solution is taken into account. For this investigation, a deep neural network with multiple thresholds is applied to generate multiple segmentation candidates. As a test bed, bidirectional long short-term memory (BiLSTM) units are used with eleven contexts in a deep neural network. As performance indices, three measures (recall, precision, and F-measure) are plotted against various thresholds for both boundary-level and word-level evaluation. Experiments show that multi-candidate word segmentation can increase recall while maintaining precision.
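The idea of producing several candidates by thresholding can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes some model (for example a BiLSTM tagger) has already produced, for each character position, a probability that a word boundary follows it. The function names, the example text, the probabilities, and the thresholds are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set, Tuple


def segment(text: str, probs: Sequence[float], threshold: float) -> List[str]:
    """Cut after character i whenever probs[i] >= threshold.

    probs has len(text) - 1 entries: one per gap between characters.
    """
    words, start = [], 0
    for i in range(len(text) - 1):
        if probs[i] >= threshold:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


def multi_candidates(text: str, probs: Sequence[float],
                     thresholds: Sequence[float]) -> List[List[str]]:
    """One segmentation per threshold, with duplicate candidates dropped."""
    seen, out = set(), []
    for t in thresholds:
        cand = tuple(segment(text, probs, t))
        if cand not in seen:
            seen.add(cand)
            out.append(list(cand))
    return out


def boundaries(words: Sequence[str]) -> Set[int]:
    """Character offsets of the internal word boundaries."""
    cuts, pos = set(), 0
    for w in words[:-1]:
        pos += len(w)
        cuts.add(pos)
    return cuts


def spans(words: Sequence[str]) -> Set[Tuple[int, int]]:
    """(start, end) character span of each word."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out


def prf(pred: set, gold: set) -> Tuple[float, float, float]:
    """Precision, recall, and F-measure between two sets of items."""
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Hypothetical example: made-up boundary probabilities for an unspaced string.
text = "icecreamcone"
gold = ["ice", "cream", "cone"]
probs = [0.1, 0.1, 0.9, 0.1, 0.1, 0.4, 0.1, 0.6, 0.1, 0.1, 0.1]

for t in (0.3, 0.5, 0.7):
    cand = segment(text, probs, t)
    print(t, cand,
          "boundary:", prf(boundaries(cand), boundaries(gold)),
          "word:", prf(spans(cand), spans(gold)))
```

In this toy setting, a lower threshold proposes more boundaries (which tends to raise boundary recall), while a higher threshold proposes fewer (which tends to raise precision). Keeping the candidates from several thresholds, as `multi_candidates` does, preserves both kinds of solutions for later use.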
