Language Resource Addition: Dictionary or Corpus?

2014-05-01 · LREC 2014

Shinsuke Mori, Graham Neubig

Abstract

In this paper, we investigate the relative effect of two strategies for adding language resources on word segmentation and part-of-speech tagging in Japanese. The first strategy is adding entries to the dictionary, and the second is adding annotated sentences to the training corpus. The experimental results show that adding annotated sentences to the training corpus is more effective than adding entries to the dictionary. Adding annotated sentences is especially efficient when new words are added together with the contexts of three real occurrences, as partially annotated sentences. Based on this finding, we annotated invention disclosure texts and measured the resulting word segmentation accuracy.
