zNLP: Identifying Parallel Sentences in Chinese-English Comparable Corpora

2017-08-01 · WS 2017

Zheng Zhang, Pierre Zweigenbaum

Abstract

This paper describes the zNLP system for the BUCC 2017 shared task. Our system identifies parallel sentence pairs in Chinese-English comparable corpora in three steps: it translates Chinese sentences into English word by word, uses the search engine Solr to select near-parallel candidate sentences, and then applies an SVM classifier to identify the true parallel sentences among those candidates. It obtains an F1-score of 45% on the test set and 32% on the training set.
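
The three steps named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the toy dictionary `TOY_DICT`, the Jaccard word-overlap ranking (a stand-in for a Solr query), and the fixed threshold in `is_parallel` (a stand-in for the SVM classifier) are all assumptions chosen so the example runs without external services. The input is assumed to be Chinese text that has already been segmented into tokens.

```python
from typing import Dict, List, Tuple

# Hypothetical toy bilingual lexicon; the abstract does not say which dictionary is used.
TOY_DICT: Dict[str, str] = {
    "我": "i",
    "喜欢": "like",
    "猫": "cats",
    "他": "he",
    "读": "reads",
    "书": "books",
}


def translate_word_by_word(zh_tokens: List[str]) -> List[str]:
    """Map each segmented Chinese token to its dictionary entry, dropping unknown tokens."""
    return [TOY_DICT[t] for t in zh_tokens if t in TOY_DICT]


def jaccard(a: List[str], b: List[str]) -> float:
    """Word-set overlap between two token lists (0.0 if either is empty)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve_candidates(
    query: List[str], en_sentences: List[str], top_k: int = 2
) -> List[Tuple[float, str]]:
    """Rank English sentences by overlap with the translated query.

    Stand-in for the Solr retrieval step: a real search engine would use
    an inverted index and its own relevance scoring instead.
    """
    scored = [(jaccard(query, s.lower().split()), s) for s in en_sentences]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def is_parallel(score: float, threshold: float = 0.5) -> bool:
    """Stand-in for the SVM step: accept a candidate if its score clears a threshold."""
    return score >= threshold


if __name__ == "__main__":
    english = ["I like cats", "He reads books", "The weather is nice"]
    query = translate_word_by_word(["我", "喜欢", "猫"])
    for score, sentence in retrieve_candidates(query, english):
        print(sentence, score, is_parallel(score))
```

In a fuller version, the overlap score would be one of several features passed to a trained classifier rather than compared against a hand-picked threshold.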

Tasks

Machine Translation, Sentence
