Translation Memory Retrieval Using Lucene

2021-09-01 · RANLP 2021

Kwang-Hyok Kim, Myong-ho Cho, Chol-ho Ryang, Ju-song Im, Song-yong Cho, Yong-jun Han

Abstract

A translation memory (TM) system, a major component of computer-assisted translation (CAT), is widely used to improve human translators’ productivity by making effective use of previously translated resources. We propose a method for high-speed retrieval from a large translation memory using similarity evaluation based on a vector model, and we present experimental results. From our experiments with Lucene, an open-source information retrieval engine, we conclude that real-time retrieval is achievable, with retrieval times on the order of tens of microseconds even for a large translation memory of 5 million segment pairs.
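To make the idea of vector-model similarity concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and does not use Lucene or its scoring formula: it assumes whitespace tokenization, a smoothed TF-IDF weighting, and cosine similarity, and it scans every segment linearly, whereas a search engine such as Lucene would use an inverted index to avoid that scan. The toy TM and the names `build_index` and `retrieve` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real TM would use
    # a language-aware analyzer.
    return text.lower().split()


def build_index(segments):
    """Compute smoothed IDF weights and TF-IDF vectors for source segments."""
    docs = [Counter(tokenize(s)) for s in segments]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())  # document frequency: one count per segment
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in d.items()} for d in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, tm, idf, vectors, top_k=3):
    """Return up to top_k (score, (source, target)) pairs, best first."""
    counts = Counter(tokenize(query))
    # Terms unseen in the TM get weight 0 and do not affect the score.
    qvec = {t: tf * idf.get(t, 0.0) for t, tf in counts.items()}
    scored = [(cosine(qvec, v), pair) for v, pair in zip(vectors, tm)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy translation memory of (source, target) segment pairs.
tm = [
    ("open the file menu", "ouvrir le menu fichier"),
    ("close the file", "fermer le fichier"),
    ("save the document", "enregistrer le document"),
]
idf, vectors = build_index([source for source, _ in tm])
results = retrieve("open the file", tm, idf, vectors)
```

In this sketch, `results` lists the stored segment pairs ordered by similarity to the query, so a translator would be offered the closest previous translation first. The brute-force scan is only practical for small memories; the retrieval speeds reported in the abstract depend on an indexed search engine.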
