Identifying DNA Sequence Motifs Using Deep Learning

2023-11-20

Asmita Poddar, Vladimir Uzun, Elizabeth Tunbridge, Wilfried Haerty, Alejo Nevado-Holgado

Abstract

Splice sites play a crucial role in gene expression, and accurate prediction of these sites in DNA sequences is essential for diagnosing and treating genetic disorders. We address the challenge of splice site prediction by introducing DeepDeCode, an attention-based deep learning sequence model that captures the long-term dependencies among the nucleotides in DNA sequences. We further propose using visualization techniques for accurate identification of sequence motifs, which enhance the interpretability and trustworthiness of DeepDeCode. We compare DeepDeCode to other state-of-the-art methods for splice site prediction and demonstrate its accuracy, explainability and efficiency. Given the results of our methodology, we expect that it can be used in healthcare applications to reason about genomic processes, and that it can be extended to discover new splice sites and genomic regulatory elements.
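To make the idea of attention-based motif visualization concrete, here is a minimal toy sketch in plain Python. It is not the DeepDeCode architecture, which the abstract does not specify. It assumes a hand-set query vector favoring the dinucleotide GT (introns typically begin with GT), where a trained model would learn its queries from data. The names `window_features`, `attention_weights`, `top_positions`, and the `sharpness` parameter are hypothetical and chosen for illustration only.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot(base):
    """Encode one nucleotide as a 4-dimensional one-hot vector (all zeros if unknown)."""
    return [1.0 if base == n else 0.0 for n in NUCLEOTIDES]


def window_features(seq):
    """Represent each position i by the one-hot codes of bases i and i+1 (8 dimensions)."""
    seq = seq.upper()
    return [one_hot(seq[i]) + one_hot(seq[i + 1]) for i in range(len(seq) - 1)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(seq, motif="GT", sharpness=4.0):
    """Scaled dot-product attention of a fixed motif query over all positions.

    The query is hand-set here (an assumption); a real model would learn it.
    """
    query = [sharpness * v for v in one_hot(motif[0]) + one_hot(motif[1])]
    scale = math.sqrt(len(query))
    scores = [
        sum(q * x for q, x in zip(query, feats)) / scale
        for feats in window_features(seq)
    ]
    return softmax(scores)


def top_positions(seq, k=2):
    """Return the k positions with the highest attention weight, in sequence order."""
    weights = attention_weights(seq)
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    sequence = "CAGGTAAGT"
    for position, weight in enumerate(attention_weights(sequence)):
        # A crude text "heat map": longer bars mean higher attention.
        print(position, sequence[position:position + 2], "#" * round(weight * 40))
```

Plotting the weights per position, as the text bars do here, is the simplest form of the visualization the abstract refers to: positions that receive the most attention are candidate motif locations.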
