
Sentence Similarity Learning by Lexical Decomposition and Composition

2016-02-23 · COLING 2016 · Code Available

Zhiguo Wang, Haitao Mi, Abraham Ittycheriah


Abstract

Most conventional sentence similarity methods focus only on the similar parts of two input sentences and simply ignore the dissimilar parts, which often carry useful cues about the sentences' meanings. In this work, we propose a model that takes both similarities and dissimilarities into account by decomposing and composing lexical semantics over sentences. The model represents each word as a vector and calculates a semantic matching vector for each word based on all words in the other sentence. Each word vector is then decomposed into a similar component and a dissimilar component based on its semantic matching vector. After this, a two-channel CNN model is employed to capture features by composing the similar and dissimilar components. Finally, a similarity score is estimated over the composed feature vectors. Experimental results show that our model achieves state-of-the-art performance on the answer sentence selection task and a comparable result on the paraphrase identification task.
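As a rough illustration of the matching and decomposition steps described in the abstract (the two-channel CNN composition and scoring are omitted), here is a minimal NumPy sketch. The function names are hypothetical, and the choice of max-matching with orthogonal decomposition is an assumption: the paper proposes several variants of both the matching function and the decomposition function.

```python
import numpy as np

def semantic_matching(S, T):
    """For each word vector in sentence S, pick the most similar word
    vector from sentence T (a 'max' matching; weighted variants exist)."""
    Sn = S / np.linalg.norm(S, axis=1, keepdims=True)
    Tn = T / np.linalg.norm(T, axis=1, keepdims=True)
    A = Sn @ Tn.T                  # (m, n) cosine similarity matrix
    return T[A.argmax(axis=1)]     # (m, d) semantic matching vectors

def decompose(S, M):
    """Orthogonal decomposition: split each word vector into a component
    parallel to its matching vector (similar channel) and the orthogonal
    remainder (dissimilar channel)."""
    alpha = (S * M).sum(axis=1, keepdims=True) / (M * M).sum(axis=1, keepdims=True)
    similar = alpha * M            # projection onto the matching vector
    dissimilar = S - similar       # residual, orthogonal to M
    return similar, dissimilar
```

By construction the two channels sum back to the original word vectors, and the dissimilar component is orthogonal to the matching vector, so the downstream CNN sees a clean split of "covered" versus "uncovered" semantics.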


Benchmark Results

Dataset | Model | Metric | Claimed | Verified | Status
WikiQA  | LDC   | MAP    | 0.71    |          | Unverified

Reproductions