Neural Edit Operations for Biological Sequences

2018-12-01 · NeurIPS 2018

Satoshi Koide, Keisuke Kawano, Takuro Kutsuna

Abstract

The evolution of biological sequences, such as proteins or DNA, is driven by three basic edit operations: substitution, insertion, and deletion. Motivated by the recent progress of neural network models for biological tasks, we implement two neural network architectures that can handle such edit operations. The first is the edit-invariant neural network, based on a differentiable Needleman-Wunsch algorithm. The second is a deep CNN with concatenations. Our analysis shows that CNNs can recognize star-free regular expressions, and that deeper CNNs can recognize more complex regular expressions, including those involving the insertion or deletion of characters. The experimental results for the protein secondary structure prediction task suggest the importance of handling insertions and deletions. The test accuracy on the widely used CB513 dataset is 71.5%, which is 1.2 points better than the current best result among non-ensemble models.
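To make the idea of a differentiable Needleman-Wunsch algorithm concrete, the sketch below relaxes the `max` in the standard alignment recurrence to a log-sum-exp "smooth max". It is an illustration only and not the authors' implementation: the function names `smooth_max` and `soft_nw_score`, the temperature `gamma`, and the match/mismatch/gap scores are all assumptions chosen for the example.

```python
import math


def smooth_max(values, gamma):
    """Log-sum-exp relaxation of max; tends to max(values) as gamma -> 0."""
    m = max(values)  # subtract the max for numerical stability
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def soft_nw_score(x, y, gamma=1.0, match=1.0, mismatch=-1.0, gap=1.0):
    """Smoothed global alignment score of sequences x and y.

    Hypothetical scoring scheme: +match / +mismatch per aligned pair,
    -gap per inserted or deleted character (linear gap penalty).
    """
    n, m = len(x), len(y)
    # F[i][j] = smoothed score of aligning x[:i] with y[:j]
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = -gap * i  # x[:i] aligned against gaps only
    for j in range(1, m + 1):
        F[0][j] = -gap * j  # y[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = smooth_max(
                [
                    F[i - 1][j - 1] + s,  # substitution or match
                    F[i - 1][j] - gap,    # deletion (gap in y)
                    F[i][j - 1] - gap,    # insertion (gap in x)
                ],
                gamma,
            )
    return F[n][m]
```

With a small `gamma` the result approaches the classical Needleman-Wunsch score, while a larger `gamma` gives a smoother function of the scores. Because every step is built from sums, exponentials, and logarithms, the same recurrence written in an automatic-differentiation framework would admit gradients with respect to the substitution scores, which is what allows an alignment score of this kind to be used inside a trainable network.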
